STATISTICAL DEFINITIONS OF TEST VALIDITY
FOR MINORITY GROUPS¹


LLOYD G. HUMPHREYS²


University of Illinois, Urbana 


This study considers the problem of deciding when a selection test is invalid 
for members of a minority group. There is both a strong empirical and theo- 
retical basis for rejecting the choice of zero correlation between test and cri- 
terion as an appropriate null hypothesis. This choice, for one thing, typically 
requires that the population value of the correlation be higher in the minority 
group than in the majority group. Recommended instead is the direct comparison
of correlations in minority and majority samples. Only in the event
that the minority correlation is significantly lower and the confidence limits 
around that correlation include no useful levels of the relationship should the 
correlation be considered essentially zero. 


Demonstration of the validity of a selection 
test for a minority group is required by the 
Equal Employment Opportunity Commission 
(1970) in its published guidelines, but the 
statistical test for satisfying this requirement 
is not spelled out. As the requirement is frequently
being interpreted, however, the correlation³
in a sample (frequently a very small
sample and typically a much smaller sample 
than that available for the majority) is com- 
pared with a hypothetical population value of 
zero. This procedure has seemingly been ac- 
cepted without question in spite of the fact 
that it has a very undesirable statistical prop- 
erty and makes psychological assumptions 
that are both theoretically and empirically 
untenable. An alternative statistical test comparing
the correlation in the minority sample
with the correlation in the largest possible
sample available for the majority group avoids
the undesirable statistical property, has more
tenable psychological assumptions, and meets
the intent of the regulation.


1 The writer acknowledges with thanks the support
of the Research Board of the Urbana campus of the
University of Illinois in the preparation of this manuscript.

2 Requests for reprints should be sent to Lloyd G.
Humphreys, Department of Psychology, University
of Illinois, 425 Psychology Building, Champaign, Illinois
61820.

3 The entire discussion has been phrased in terms
of correlational analysis rather than the preferred
regression analysis only because this is the way in
which these problems have been generally discussed.
The writer has for many years advocated (Humphreys,
1952) comparison of both slopes and intercepts
of regression lines in the groups of interest to
the investigator. If correlations in two groups are
comparable, slopes of regression lines will probably be
comparable. Comparison of intercepts, or amount of
separation between the two regression lines, answers
a very different and more important question but
one not considered in this paper.


TESTS OF DIFFERENTIAL VALIDITY 


The different statistical hypotheses de- 
scribed above are the two commonly reported 
ways of testing differential validity in two or 
more groups; that is, correlations in each 
group are either independently compared with 
zero or compared directly with each other. 
The two approaches are sometimes character- 
ized as differing in rigor, with the second be- 
ing the more rigorous test of differential valid- 
ity; but in point of fact, they answer dif- 
ferent research questions. They also have 
markedly different properties. Using the first,
all that is needed to claim differential validity
is a small enough N, while using the second
requires an N of sufficient size to distinguish
between sample correlations from two populations
in which the correlations are probably
very similar. From this relationship with N
springs the characterization of a difference in
rigor, but the difference between the two statistical
tests is a difference in kind. The reporting
of separate t tests comparing sample
correlations with hypothetical population values
of zero as a way of assessing differential
validity represents a fundamental error in
logic.


When the problem is to determine differential
validity, the choice of direct comparison
of sample correlations is unequivocally the
only choice possible. The logic of this procedure
is derived directly from the meaning
of the words in the statement of the problem,
no more and no less. To determine whether
two correlations differ from each other, they
must be compared with each other. In making
this comparison, allowance must be made for
the sampling error of each in order to determine
whether the difference in the samples
reflects a difference in the populations. The
reasoning here is precisely equivalent to that
involved in comparing differences between
differences in means. There must be a direct
comparison in both cases. The fact that a first
correlation or a first difference in means is
greater than zero at an appropriate level of
significance, while a second correlation or a
second difference in means is not, does not
allow one to conclude that the first correlation,
or the first difference in means, is greater
than the second.
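The direct comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard Fisher Z test for two independent correlations, the function names are hypothetical, and the sample figures are invented for the example rather than taken from any study.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def compare_correlations(r1: float, n1: int, r2: float, n2: int):
    """Two-sided z test of H0: rho1 == rho2 for two independent samples.

    Each r is converted to Fisher's Z; the standard error of the
    difference is sqrt(1/(n1 - 3) + 1/(n2 - 3)).
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se_diff = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se_diff
    p_two_sided = 2.0 * (1.0 - normal_cdf(abs(z)))
    return z, p_two_sided


# Hypothetical figures: majority r = .34 on N = 100, minority r = .20 on N = 20.
z, p = compare_correlations(0.34, 100, 0.20, 20)
print(f"z = {z:.2f}, two-sided p = {p:.2f}")
```

With these invented figures the minority correlation would not be significant against zero on its own, yet the direct comparison gives no grounds for concluding that the two correlations differ.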


[...] direct comparison of
sample correlations to the
present instance [...]
the following [...]
minorities probably do not belong to the same biological
species as the majority; but if they
do, the environmental differences have been 
so profound and have produced such huge 
cultural differences that the same principles of 
human behavior do not apply to both groups. 
While some psychologists may react to this 
analysis as a gross caricature, even though it 
is implicit in their statistical analyses, it is a 
point of view that has been explicitly accepted 
by at least one psychologist. Robert Williams, 
as quoted in a St. Louis Post Dispatch news 
story (Standardized I.Q. Tests, 1971), has 
stated: “You never compare black and white 
people. We are Afro-Americans, and whites 
are either Euro-Americans or Euro-Asians [D-
A 10].” This is a point of view, however,
which the present writer unequivocally rejects
on both theoretical and empirical grounds. 
Since the number of cases in a minority
sample is almost always much smaller than
the number of cases in a majority sample,
there is also a serious statistical flaw in the
common procedure. In order for the probabilities
to be equal that the correlations in the
two samples will both be significant, the population
value of the correlation in the minority
group must be larger, frequently much larger,
than the same correlation in the majority
group. Since this state of affairs is highly improbable,
the resulting unequal probability of
finding significance has been termed the validator's
gamble (Enneis, 1972). The procedure
not only forces the employer to gamble,
typically against highly unfavorable odds, in
defending his employment practices, but
also results not infrequently in the rejection
of valid tests or other selection devices for
members of the minority group or groups.
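The size of the handicap can be made concrete with a small sketch. It assumes the usual Fisher Z normal approximation and a two-sided .05 test against zero; the function names and the sample sizes (100 and 20) are illustrative assumptions, not figures from this article.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def power_against_zero(rho: float, n: int, z_crit: float = 1.96) -> float:
    """Approximate probability that a sample r is significant against zero
    (two-sided) when the population correlation is rho."""
    shift = math.atanh(rho) * math.sqrt(n - 3)  # true Fisher Z in SE units
    upper_tail = 1.0 - normal_cdf(z_crit - shift)
    lower_tail = normal_cdf(-z_crit - shift)
    return upper_tail + lower_tail


def rho_for_equal_power(rho_major: float, n_major: int, n_minor: int) -> float:
    """Population correlation the smaller sample would need so that both
    samples are equally likely to reach significance."""
    # Power depends only on atanh(rho) * sqrt(n - 3), so equate that quantity.
    scale = math.sqrt((n_major - 3) / (n_minor - 3))
    return math.tanh(math.atanh(rho_major) * scale)


# Hypothetical case: population validity .34, N = 100 versus N = 20.
needed = rho_for_equal_power(0.34, 100, 20)
print(f"power, N = 100, rho = .34: {power_against_zero(0.34, 100):.2f}")
print(f"power, N = 20,  rho = .34: {power_against_zero(0.34, 20):.2f}")
print(f"rho needed at N = 20 for equal power: {needed:.2f}")
```

Under these assumptions the smaller sample would need a population correlation roughly twice as large as the majority value merely to have the same chance of reaching significance.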


THE PREFERRED NULL HYPOTHESIS


A very reasonable psychological rationale
supports a near zero difference between majority
and minority groups in the size of correlations
between predictors and criteria:
namely, there is every reason to accept a single
biological species for blacks and whites
and a high degree of cultural similarity as
well. While there are obvious environmental
differences, these differences are not so profound
as to require different psychological
principles in the explanation of black and
white behavior. The two groups use a highly 
similar (if not identical) language, attend 
similar schools, are exposed to similar cur- 
ricula, listen to the same radio programs, look 
at the same television programs, live in the 
same cities, and buy the same commodities, 
etc. Cultural differences are a question of
degree, not of kind. The most probable re- 
sult, in terms of this reasoning, is comparabil- 
ity of correlations between predictors and cri- 
teria. It is much more probable that levels of 
performance will be affected by the environ- 
mental differences than will the size of the 
correlation between two samples of behavior. 

Perhaps the importance of the distinction 
between correlation and level of performance 
can be reinforced by analysis of the basis for 
correlations between tests and criteria. A mea- 
sure of verbal comprehension is correlated 
with later reading achievement, not because 
the test measures aptitude for reading, but 
because the understanding of the meaning of 
words is common to both. Complex motor
coordination and spatial orientation tests are 
correlated with later success in pilot training 
because these abilities are common to both 
the tests and the criterion. An ability that has 
not developed, or has not been developed, will 
generally lead both to low test and to low 
criterion performance, while different levels of 
development of abilities among individuals 
produce the observed correlations between 
tests and criteria. 

An even more compelling basis for the null 
hypothesis of zero difference between the sam- 
ple correlations is available in the literature. 
After all, good data are more convincing than 
psychological theory. If one reads only the 
conclusions in published articles and mono- 
graphs, it would seem that differential validity 
occurs with high frequency and that the first 
psychological rationale is more nearly correct. 
If one analyzes the data carefully, however,
the evidence for a substantial amount of differential
validity evaporates. The seeming high
frequency of its occurrence is an artifact of
the erroneous statistical logic described earlier.
Well-established cases of substantial differential
validity are rare. An a priori expectation
of highly similar validities for majority
and minority groups is justified by the available
data. (See Boehm, 1972; Gordon, 1953;
Stanley, 1971; Temp, 1971.)

Without theoretical or empirical support, it 
is legitimate to speculate concerning the 
source of the expectation of substantial dif- 
ferential validity. One likely hypothesis is 
wishful thinking. Perhaps more likely is that 
psychologists are wedded to a single, simplis- 
tic, hypothesis-testing attitude toward data 
that is applied automatically without regard to 
its relevance to the problem at hand. Statis- 
tics is a tool useful in improving the accuracy 
of decisions about psychological (and other) 
phenomena, but the psychologist is responsible 
for asking the right questions. 


VALIDITY IN THE MINORITY GROUP


As the final step it is necessary to look at 
the model of hypothesis testing here proposed 
with reference to the intent of the Equal Em- 
ployment Opportunity Commission guidelines 
concerning validity in minority groups. The 
complete model is described and its conse- 
quences outlined. From this, its relevance to 
the guidelines should become clear. 

The test or other predictor in question is 
being used and is valid at a practically useful 
level for the majority group. The number of 
cases in this validation should have been as 
large as possible so that the correlation would 
have maximum sampling stability. The larg- 
est number of cases obtainable is used in a 
validation study for the minority group. The 
majority and minority correlations are then 
compared directly with each other by means 
of the appropriate statistical test. If the dif- 
ference in validity (differential validity) is 
not statistically significant, one concludes that 
the assumption of equivalent validity in the 
minority and majority groups cannot be dis- 
carded; that is, the test is accepted as valid 
for the minority group. 

If the difference between the two correlations
is statistically significant, the test may
still be useful in the minority group. Even if
the validity of the test in that group is the
smaller of the two correlations and its confidence
interval contains population values
representing useful degrees of validity, the
test should be continued. This continuation
might be on an experimental basis if the confidence
interval also includes zero, but it
should be on an operational basis if the confidence
interval excludes essentially zero values.
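The decision rule in this paragraph can be sketched as follows. The confidence interval uses the standard Fisher Z approximation; the cutoff of .20 for a "useful" level of validity and the function names are assumptions introduced only for illustration, since the text does not fix a numerical threshold.

```python
import math


def correlation_ci(r: float, n: int, z_crit: float = 1.96):
    """Approximate 95% confidence interval for a population correlation,
    computed on the Fisher Z scale and transformed back to r."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


def minority_recommendation(r: float, n: int, useful: float = 0.20) -> str:
    """Classify a minority-sample validity that is significantly lower than
    the majority validity. `useful` is a hypothetical cutoff."""
    lower, upper = correlation_ci(r, n)
    if upper < useful:
        # The interval contains no useful level of validity.
        return "treat as essentially zero"
    if lower <= 0.0:
        # Useful values are plausible, but so is zero.
        return "continue on an experimental basis"
    return "continue on an operational basis"


# Hypothetical minority-sample results.
for r, n in [(0.10, 20), (0.02, 500), (0.30, 200)]:
    lo, hi = correlation_ci(r, n)
    print(f"r = {r:.2f}, N = {n}: CI = ({lo:.2f}, {hi:.2f}) -> "
          f"{minority_recommendation(r, n)}")
```

The small-sample case illustrates the argument of the text: a low sample correlation on 20 cases leaves useful population values well inside the interval, so it cannot establish a near zero relationship.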

Inability to reject the hypothesis of zero 
population correlation on the basis of a sta- 
tistical test does not mean that the hypothesis 
must or should be accepted. When a substan- 
tial correlation is well established for one 
group, there are strong theoretical and em- 
pirical bases for a high a priori probability of 
nonzero correlation of substantial size in the 
second group. In this circumstance a near
zero degree of relationship is a low-probability
hypothesis. A near zero relationship can
only be established by obtaining a sample
correlation of near zero on a large N. The
confidence limits for the sample correlation
must exclude population correlations of a
useful size.

[...] gamble for the mi[...] the validator's ga[...]tial. If differential
validity [...] revealed by a continuing r[...]


been small. If differential validity is tiny— 
there are strictly speaking no zero differences 
in natural phenomena—the practical effects of 
an assumption of no difference are inconse- 
quential. In place of the current definition, 
therefore, psychologists should establish the 
different and fairer definition of test validity 
for minority groups here described. 
RACIAL DIFFERENCES IN VALIDITY OF
EMPLOYMENT TESTS: 
REALITY OR ILLUSION? 


FRANK L. SCHMIDT, JOHN G. BERNER, AND JOHN E. HUNTER


Michigan State University 


The fit of data in the literature on single-group validity of employment tests 
to a statistical model assuming equal true validities for blacks and whites 
was tested. For both subjective and objective criterion measures, observed 
frequencies of both kinds of single-group validity (significant for whites but not 
for blacks and vice versa) were not significantly different from those predicted 
by the null differences model. These findings cast serious doubt on the exis- 
tence of single-group and differential validity as substantive phenomena. It 
was concluded that psychologists concerned with the applicability of employ- 
ment tests to minority groups should direct their future efforts to the study 
and determination of test fairness rather than to the pseudoproblem of racial
differences in test validity.


The possibility that employment tests and 
other selection devices might be unfair to,
and/or inappropriate for, blacks and other 
minority group members has caused great 
concern in recent years. Although it is often 
not made clear, there are two different and 
distinct phenomena involved in this area: 
(a) test fairness and (b) validity differences.
Although there is disagreement as to the defi- 
nition of test fairness (Cleary, 1968; Darling- 
ton, 1971; Thorndike, 1971), no one disputes 
the proposition that subgroup differences in 
validity coefficients are a separate phenome- 
non. 

Boehm (1972) distinguishes between two 
kinds of validity differences: (a) differential 
validity, in which there is a significant differ- 
ence between the validity coefficients obtained 
for the two ethnic groups and one or both 
coefficients are significantly different from 
zero; and (b) single-group validity, in which
the obtained validity is significantly different 
from zero for one group only and there is no 
significant difference between the two coeffi- 
cients. In her review, Boehm (1972) noted 
that the frequency of single-group validity was 
about five times that of differential validity. 

1 Requests for reprints should be sent to Frank
L. Schmidt, Department of Psychology, Michigan
State University, East Lansing, Michigan 48823.

The purpose of the present study is to determine
whether a statistical model of the
validation situation which assumes there are
no true differences between blacks and whites
in validity of employment tests can adequately 
account for the findings of single-group valid- 
ity reported in the literature. There are a 
number of a priori reasons for hypothesizing 
that such a model may fit: (a) small sample 
sizes, especially for blacks, are quite common 
in these studies; (b) large differences between
sample sizes for the two races are quite
common, greatly increasing probabilities of 
single-group validity; and (c) failure to cross-validate
(after ex post facto identification of
single-group validities) is apparently a universal
practice.²

It has been suggested (Boehm, 1972; Bray
& Moses, 1972) that findings of validity differences
by race are associated with the use
of subjective criteria, such as ratings, rank- 
ings, grades, etc., and that validity differences 
seldom occur when more objective criteria are 
employed. A second purpose of the present 
study is to test this hypothesis by examining 
the fit of the null differences model separately
for the two kinds of criteria. In essence, this 
hypothesis predicts that, using objective cri- 
teria, the model will fit validity data gener- 
ated, but that the frequency of single-group 


2 Of the 19 researches on validity differences reviewed
for this study, none employed cross-validation.
This is true despite the fact that many studies
involved ex post facto examination of large numbers 
of pairs of coefficients. Farr, O'Leary, and Bartlett 
(1971), for example, reported 80 pairs of coefficients. 
validity with subjective criteria will be higher
than can be accounted for by chance.

The model to be tested is illustrated in Figure 1.
Validity coefficients are in the form of
Fisher's Z, which has an approximately normal
distribution for all values of r. SD_Z =
1/√(N − 3), which for the white sample of 100
is .10 and for the black sample of 20 is .24. In
this hypothetical, but perhaps typical, situation
the true validity for both races is .34 (Z
= .35). If both these distributions are transformed
to unit normal distributions [N(0,1)],
the probability of obtaining a nonsignificant
validity coefficient in the white sample is the
area a (approximately .07) and the probability
of a significant coefficient is the area 1 − a
(here approximately .93). The minimum
Fisher Z validity needed for significance at
the .05 level when N = 100 is .20. In the
black sample, area b represents the probability
of a nonsignificant coefficient (.69 here),
and 1 − b or .31 is the probability of a significant
coefficient. When N = 20, the minimum
Fisher Z needed for significance at the
.05 level is .47.


Now, since outcomes in the two distributions
are independent, the joint probability of
a significant validity for whites and a nonsignificant
one for blacks [p(Ws, Bns)] is (1 − a)(b) or (.93)(.69).
The probability that both validities reach
significance [p(Ws, Bs)] is (1 − a)(1 − b) or (.93)(.31).
Validity is nonsignificant for both races
[p(Wns, Bns)] with probability (a)(b) or (.07)(.69),
and the probability of a significant black but
not white coefficient [p(Wns, Bs)] is (a)(1 − b) or (.07)(.31).
The sum of these four joint probabilities
is 1.00. It should be noted that, even
though true validity is the same for both
races, the probability of nonsignificance for
blacks and significance for whites is much
greater than the reverse outcome (.64 vs. .02).
This model takes into account not only the
size of each sample, but the difference in size
between samples and the overall level of validity.


FIG. 1. Statistical model of biracial validation (true validity
for both races is .34; Fisher's Z = .35). Horizontal axis:
Fisher's Z transformation of validities.
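The arithmetic of the model can be checked with a short sketch. It assumes, as Figure 1 appears to, that significance corresponds to the sample Fisher Z exceeding the positive .05 critical value (1.96 standard errors); the function names are hypothetical. Because the text works from rounded areas (.93, .69), the unrounded products below agree with the quoted figures only to within about .01.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_significant(true_r: float, n: int, z_crit: float = 1.96) -> float:
    """Probability that the sample Fisher Z exceeds its critical value."""
    se = 1.0 / math.sqrt(n - 3)          # SD of Fisher's Z
    critical_z = z_crit * se             # minimum Z needed for significance
    return 1.0 - normal_cdf((critical_z - math.atanh(true_r)) / se)


def outcome_probabilities(true_r: float, n_white: int, n_black: int) -> dict:
    """Joint probabilities of the four validation outcomes, assuming the
    two samples are independent and share one true validity."""
    p_w = prob_significant(true_r, n_white)
    p_b = prob_significant(true_r, n_black)
    return {
        "Ws,Bs": p_w * p_b,
        "Ws,Bns": p_w * (1.0 - p_b),
        "Wns,Bs": (1.0 - p_w) * p_b,
        "Wns,Bns": (1.0 - p_w) * (1.0 - p_b),
    }


# The worked example from the text: true validity .34, N = 100 and N = 20.
probs = outcome_probabilities(0.34, 100, 20)
for outcome, p in probs.items():
    print(f"{outcome}: {p:.3f}")
```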


METHOD


Nineteen studies reporting employment test validities
separately by race were found; these studies
included 86 different predictors and 74 different criterion
measures. A total of 410 pairs of validity coefficients
and sample sizes were reported. For each
of these pairs, both validities were converted to
Fisher Z and averaged to provide an estimate of
the true validity. Application of the model
in Figure 1 using a computer program⁵ written for
this purpose yielded estimates of the probabilities of
each of the four possible validation outcomes (Wns,
Bns; Ws, Bs; Ws, Bns; and Bs, Wns) for each of the
410 data sets. Probabilities of each outcome were
summed across data sets to provide the expected
frequency of each outcome. Observed frequencies
were then tested against those expected using chi-square.
This analysis was then done separately for
the 161 and 249 data sets involving subjective and
objective criterion measures, respectively. All ratings,
rankings, etc., and grades in training (when not
based on performance measures) were considered
subjective criteria; performance measures such as
quality and quantity of output, job sample measures
of proficiency, errors, attendance, tenure, written job
knowledge tests, and the like were considered objective
criteria. Although they depend to some extent
on subjective evaluations, salary level and p[...]
were considered to be closer to objective
than subjective criteria and were classified as the
former.
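The procedure described in this section can be sketched as below. This is not the authors' program, which is not reproduced in the article; it is a minimal reconstruction that assumes an unweighted average of the two Fisher Z values and a critical value of 1.96 standard errors. The three data sets are invented purely to exercise the code.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_significant(true_z: float, n: int, z_crit: float = 1.96) -> float:
    """Probability of a significant coefficient given the true validity
    expressed in Fisher Z units and the sample size."""
    return 1.0 - normal_cdf(z_crit - true_z * math.sqrt(n - 3))


def expected_frequencies(data_sets) -> dict:
    """Sum the four outcome probabilities across data sets.

    Each data set is (r_white, n_white, r_black, n_black). The true
    validity is estimated as the mean of the two Fisher Z values
    (unweighted, which is an assumption of this sketch).
    """
    totals = {"Wns,Bns": 0.0, "Ws,Bs": 0.0, "Ws,Bns": 0.0, "Wns,Bs": 0.0}
    for r_w, n_w, r_b, n_b in data_sets:
        true_z = (math.atanh(r_w) + math.atanh(r_b)) / 2.0
        p_w = prob_significant(true_z, n_w)
        p_b = prob_significant(true_z, n_b)
        totals["Wns,Bns"] += (1.0 - p_w) * (1.0 - p_b)
        totals["Ws,Bs"] += p_w * p_b
        totals["Ws,Bns"] += p_w * (1.0 - p_b)
        totals["Wns,Bs"] += (1.0 - p_w) * p_b
    return totals


# Hypothetical data sets, not taken from the 19 studies.
example_sets = [(0.30, 120, 0.25, 30), (0.15, 80, 0.22, 25), (0.40, 200, 0.35, 40)]
expected = expected_frequencies(example_sets)
for outcome, freq in expected.items():
    print(f"{outcome}: {freq:.2f}")
```

Because each data set contributes probabilities that sum to one, the expected frequencies always total the number of data sets, which is what allows them to be tested against the observed counts.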


4 Average s[...] studies review[...] all validities significant at or
be[...] the .05 [...] were tabulated as significant.

5 Copies of this program are available from the
senior author.


RESULTS AND DISCUSSION


Table 1 shows the distribution of the four
validation outcomes for each of the 19 studies
reviewed, the observed totals for each outcome,
and the expected frequency for each
outcome as predicted by the null model. The
chi-square of 1.39 does not even approach
TABLE 1

OBSERVED AND MODEL-PREDICTED VALIDATION OUTCOMES IN
19 STUDIES OF RACIAL DIFFERENCES IN VALIDITY

Study                                       Both ns          Both s         Ws, Bns        Wns, Bs
Campbell, Pike, & Flaugher (1969)           0                8              0              0
Campion & Freihoff (1970)                   5                0              1              4
Farr (1971)
  Study 1                                   1                [0]            4              0
  Study 2                                   79               0              7              4
Farr, O'Leary, & Bartlett (1971)
  Study 1                                   17               0              2              3
  Study 2                                   28               7              22             3
Flaugher, Campbell, & Pike (1969)           16               9              3              8
Gael & Grant (1972)                         14               6              14             1
Grant & Bray (1970)                         0                8              0              0
Kirkpatrick, Ewen, Barrett, & Katzell
(1968)
  Study 1                                   23               0              1              0
  Study 2                                   22               0              4              9
  Study 3*                                  2                1              0              3
  Study 5                                   2                8              6              0
Lopez (1966)                                8                3              4              1
Mitchell, Albright, & McMurry (1968)        2                0              0              0
Ruda & Albright (1968)                      0                1              1              0
Wollowick, Greenwood, & McNamara
(1969)                                      15               3              5              1
Wood (1969)                                 7                2              1              0
U. S. Department of Labor (1969)            3                1              0              5

Observed totals                             [244 (59.5%)]    57 (13.9%)     75 (18.3%)     34 (8.3%)
Totals predicted from model                 233.8 (57.0%)    63.4 (15.5%)   75.5 (18.4%)   37.3 (9.1%)

Bracketed entries are missing from the source and are inferred from the column and grand totals.
[...] includes one America[...]
χ² = 1.39, ns (p > .80).
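The reported chi-square can be reproduced from the totals rows of Table 1. Two values used below are assumptions forced by gaps in the source: the observed Both ns total is taken as 244 (the 410 pairs minus the other three observed totals), and the third predicted total is taken as 75.5 so that the predicted row also sums to 410. The helper name is hypothetical.

```python
def chi_square(observed, expected):
    """Pearson chi-square statistic for observed versus expected frequencies."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Order: Both ns; Both s; Ws, Bns; Wns, Bs.
observed_totals = [244, 57, 75, 34]            # 244 is inferred: 410 - 57 - 75 - 34
predicted_totals = [233.8, 63.4, 75.5, 37.3]   # 75.5 is inferred from the row sum

chi2 = chi_square(observed_totals, predicted_totals)
print(f"chi-square = {chi2:.2f}")  # prints chi-square = 1.39
```

That the inferred entries reproduce the published value of 1.39 is some reassurance that they are right. Assuming 3 degrees of freedom for the four outcome categories, the statistic is far below the .05 critical value of 7.81.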


significance, and for each of the four valida- 
tion outcomes the predicted frequency is quite 
close to the observed frequency.⁶ The separate
analyses for validities computed on objective 
and subjective criteria are shown in Table 2. 
As predicted, the model-generated predictions 
are not significantly different from observed 
frequencies for the data based on objective 


6 The relatively high frequency (observed and 
predicted) of lack of validity for both races is prob- 
ably due in part to direct restriction in range on 
predictor scores; in several of the studies, some of 
the predictors had previously been employed as part 
of the selection procedure for the job in question 
(Kirkpatrick, Ewen, Barrett, & Katzell, 1968, Stud- 
ies 1 and 2; Lopez, 1966; Mitchell, Albright, & Mc- 
Murry, 1968; Ruda & Albright, 1968). In addition, 
indirect restriction in range (Thorndike, 1949, pp. 
169-176) was probably a factor in these and other 
studies, inasmuch as incumbents serving as subjects 
were probably selected at least partly on tests cor- 
related with the predictors investigated. 


Note (Table 1). ns = nonsignificant, s = significant, W = white, B = black.
[...] were excluded for the purposes of this analysis.


criteria; however, the hypothesis that ob- 
served single-group validity would be more 
frequent with subjective criteria than would 
be predicted by the null model was not borne 
out. The slight trend in the direction of the 
hypothesis is neither statistically nor practically
significant.

The proportion of validity pairs showing
single-group validity was significantly higher 
for subjective than objective criterion mea- 
sures (.37 vs. .20, p < .001). But this differ- 
ence is apparently due to differences between 
the two data sets in individual sample sizes, 
differences between black and white sample 
sizes, and general level of validity—factors 
which the model takes into account—rather 
than to differences intrinsic to the two kinds 
of criterion measures. The conclusion indi- 
cated is that when the extrinsic factors are 
controlled, ratings, rankings, and other sub- 
TABLE 2

OBSERVED AND PREDICTED VALIDATION OUTCOMES FOR OBJECTIVE AND SUBJECTIVE CRITERIA

Validation outcome    Both ns          Both s          Ws, Bns         Wns, Bs         Total

                                  Objective criteria
Observed              162 (65.1%)      37 (14.9%)      33 (13.3%)      17 (6.8%)       249
Predicted             155.0 (62.3%)    34.2 (13.7%)    36.5 (14.7%)    23.[...]        249

                                  Subjective criteria
Observed              82 (50.9%)       20 (12.4%)      42 (26.1%)      17 (10.6%)      161
Predicted             78.8 (48.9%)     29.2 (18.1%)    39.0 (24.2%)    14.0 (8.7%)     161


Note. ns = nonsignificant, s = significant, W = white [...]
[...]
b χ² = 2.59, ns (p [...] .50).
The employment problems of racial and
ethnic minority group members do not end
when integration in a job setting is achieved.
[...] analyzed 85 black men's responses to a [...]-item
questionnaire designed according to
Herzberg, Mausner, and Snyderman's [...]
two-factor theory. The results were compared
to an earlier analysis of the same questionnaire
filled in by white blue-collar workers.
They concluded that hygiene items were more
important to blacks, although their analysis
does not support this conclusion. [...]
Maslow's (1954) theory would predict similar
results, if one grants the assumption that
blacks in general, and unemployed blacks in
particular, are relatively more deprived of
lower level satisfactions than whites, and
thus should value them more highly, other
things being equal. It is also true that unemployed
people of any race should show this
effect and that when social class is varied (in
contrast to other studies), a race difference
should emerge only to the extent that blacks
are relatively more deprived than whites at
the same occupational level. In view of the
existence of racial discrimination, this is a
likely state.

It should also be true that the perceived
outcomes of work are differentially related [...]
[...]n of work in black and/or [...]
[...]prived. If work is perceived to lead to [...]
order outcomes, those whose material ne[...]
are relatively more satisfied will evaluate it
more highly.
more highly. 

Thus, this study seeks to test two hypotheses:

1. The hard-core unemployed of both races
will evaluate material outcomes more highly
than the working class, and blacks will tend
to evaluate such outcomes more highly than
whites. The black hard-core will evaluate material
outcomes most highly.

2. If the evaluation of each outcome for
each person is multiplied by its perceived
association with working at a steady job (see
Graen, 1969, for a full explanation of the instrumentality
concept), material outcomes
should correlate higher with the directly measured
evaluation of work in the black and
hard-core samples, while higher level outcomes
should correlate highest in the white
and working-class samples.


METHOD 

Subjects 

Subjects were black and white males, between the 
ages of 18 and 50, who were living in the St. 
Louis, Missouri, metropolitan area. They were paid 
volunteers, recruited from social service agencies 
(hard core) or from local businesses (working class). 
Fifty-two black working-class and white hard-core
men and 48 black hard-core and white working-class
men filled the cells of a 2 × 2 sampling design.


Interviewers and Interview 


Interviewers were black and white men, between 
the ages of 20 and 35, who were from the St. Louis 
area. All were high school graduates, either working 
or attending college, and were selected by the author 
or his associate (Director of Psychological Services, 
NASCO West, a private drug rehabilitation agency). 
All interviewers were initially trained by the author, 
although some replacements were later trained by the 
field supervisor, who had been specially trained by 
the author in anticipation of such contingencies. 

Data on job outcome evaluation was collected as 
part of a larger interview taking from 2 to 4 hours 
(Feldman, 1972). Subjects were paid $8 for a com- 
pleted interview and the interviewers were paid $10. 
Interviews were arranged at a time of mutual con- 
venience with interviewers of the subject's own race. 


Instruments 


The evaluation of 15 job outcomes was assessed.
The first 5 outcomes, considered a standard list, were
the paraphrased five factor names of the Job Description
Inventory (Smith, Kendall, & Hulin, 1969),
a well-developed, standard job satisfaction instrument.
These 5 outcomes were good pay (pay), working
with people you like (co-workers), having a good
boss (supervisor), being promoted (promotion), and
enjoying the work you do on the job (work itself).


5 The fine work of William Harvey and the staff
at the Narcotics Service Council (NASCO) West in
locating subjects and coordinating interviews is
gratefully acknowledged.

The next 10 outcomes were elicited from a sample 
of black and white, working-class and unemployed 
men that was similar to the present sample. These 
were the 10 most frequent (overall) responses to the 
question “Name three things you feel you get from 
working (at a regular job).” Table 1 presents the 
complete list of outcomes. The evaluations of work- 
ing at a regular job and being unemployed were 
also obtained. Unemployment was defined as not 
having a regular job; thus, doing day work was 
considered to be unemployment. 

Evaluation responses were assessed on the 9-point 
version of Kunin’s Faces Scale (9 = most negative, 
1=most positive) which yields roughly equal- 
interval data (Kunin, 1955). Instrumentality ratings 
were made on a three-choice scale (+1, 0, —1) in 
response to the question, “Would you have ___
if you were working at a steady job?” (+1 = would,
0 = 50-50 chance, −1 = would not). The scales,
which were on flash cards, were presented to the 
subjects, who responded orally. In order to insure 
comprehension, all items were decentered and a 
practice sheet was given before each rating. Subjects
who did not perform reasonably well on the practice
sheet (e.g., rating outcomes like “a big raise”
and “getting hurt in a job accident”), that is, those who
could not understand the task or who rated idiosyncratically
without being able to justify their
ratings, were excused with token payment.³

In order to minimize the effects of possible inter- 
viewer bias, interviewers were told the purpose of 
the study only in general terms. They were never 
aware of the specific hypotheses under investigation. 
Interviewers were additionally trained through role 
playing to adopt a neutral tone and manner and 
thus avoid giving nonverbal cues. 

A call-back procedure, in which a randomly se- 
lected 10% of the subjects were called and ques- 
tioned about the interview, further assured the qual- 
ity of the data. 
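
A random 10% call-back draw of the kind described above might be sketched as follows. The subject identifiers, the sample size of 200, the seed, and the function name are all hypothetical; the text reports only that 10% of subjects were randomly selected.

```python
import random


def draw_callback_sample(subject_ids, fraction=0.10, seed=None):
    """Randomly select a fraction of subjects, without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    k = round(len(subject_ids) * fraction)
    return sorted(rng.sample(subject_ids, k))


# Hypothetical roster of 200 numbered subjects (an assumption for illustration).
subject_ids = list(range(1, 201))
callbacks = draw_callback_sample(subject_ids, fraction=0.10, seed=1973)
```

Sampling without replacement ensures that no subject is called twice, and recording the seed lets the selection be audited later.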


RESULTS 
Validation of Sampling Strategy 


A series of demographic and job history 
questions asked at the end of the interview 
served to assess the success of the sampling 
strategy. A two-way (Race X Economic 
Class) multivariate analysis of variance on 


3 For example, someone who evaluated “being
hurt” positively because he saw the possibility of a
large insurance settlement would be allowed to continue.
For a full explication of this procedure, see
Triandis, 1972.
job history variables revealed only an economic
class main effect (F = 26.51, p <
.00005). Univariate tests showed that the
hard-core subjects reported that they received
more public aid, were employed for shorter
periods and unemployed for longer periods,
tended to be laid off from jobs rather than to quit,
and received more money per month during
layoffs than the working-class subjects. (Complete
data may be found in Feldman, 1972.)
A similar analysis, performed on demographic
variables, showed a significant race main
effect (F = 6.57, p < .00005), an economic
class effect (F = 2.21, p < .003), and a significant
interaction effect (F = 1.74, p <
.03). Univariate tests showed that black subjects
tended to have been married longer,
have more children, have lived in St. Louis
longer, have parents who worked less often
and at less skilled jobs, have been poorer as
children, and have less frequently had part-time
jobs as children than white subjects.
Hard 


any other 
1972.) 


It may be concluded from th 
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no unusual between, 

eg. a confounding 
class), which might 
tation, 


vement than 
02; interaction: F = 2.95, p < .0004). As
Table 1 illustrates, univariate tests showed
that blacks rated saving money, buying nice
things, family respect, friends' respect, and
supporting yourself as significantly more
pleasant than did whites and being bored as
significantly less unpleasant. The univariate
economic class tests showed that having
responsibilities, saving money, and supporting
yourself were rated more pleasant by the
hard core than the working class, while owing
money and being tired were rated as less
unpleasant by the hard core. Univariate interaction
effects were found for the outcomes of
working with people you like, having a good
boss, enjoying the work you do, saving money,
and friends’ respect. In all but one case, the
black working class rated the outcomes as
most pleasant and the white working class
rated them as least pleasant. Having a good
boss was rated least pleasant by the black
hard core and most pleasant by the black working
class.


Race and Economic Class Differences in
Correlates of Evaluation of Work

As can be seen in Table 2, Hypothesis I
was not supported. Although differential correlations
are present, they do not fall into
the predicted patterns. The black working
class’ evaluation of work seems to be related
to social, material, and higher order outcomes,
while the white working class’ evaluation is
not clearly related to any outcomes except
that of supporting self and family. The black
hard core's evaluation is associated with
responsibility, support, and saving money, thereby
combining both material and higher order
outcomes. The white hard core's evaluation is
most strongly associated with boredom and
support, again in contrast to the need hierarchy
prediction.

One of the most interesting findings in
Table 2 is the differential relationships between
the evaluation of working and not working.
These are significantly negatively correlated
only in the white samples and positively
(though not significantly) in the black hard-core
sample. This seems to suggest a much
different conception of work and unemployment in
the various samples and especially in
the black hard-core sample.




RACE, EMPLOYMENT, AND THE EVALUATION OF WORK
TABLE 1
MEAN EVALUATION RATINGS OF 15 JOB OUTCOMES

                                        Hard core      Working class
Job outcome                             Black          Black            White
Getting good pay                        1.46 (.82)     1.13 (.57)       1.44 (.95)
Working with people you likeᶜ           1.54 (.88)     1.21 [illegible] 1.79 (.99)
Having a good bossᶜ                     1.85 (1.14)    1.31 (.79)       1.73 (.90)
Being promoted                          1.32 (.97)     1.19 (.76)       1.50 (.91)
Enjoying work you do on the jobᶜ        1.60 (1.08)    1.29 (.84)       1.98 (1.18)
Having responsibilitiesᵇ                1.94 (1.47)    2.50 (1.95)      2.96 (1.89)
Owing moneyᵇ                            6.19 (2.62)    7.38 (2.23)      7.49 (1.78)
Saving moneyᵃ,ᵇ,ᶜ                       1.48 (.80)     1.33 (.94)       2.73 (1.64)
Buying nice things (car, television)ᵃ   1.38 (.71)     1.65 (1.29)      2.45 (1.27)
Being boredᵃ                            6.88 (2.10)    6.60 (2.63)      7.60 (1.41)
Having family's respectᵃ                1.40 (.80)     1.23 (.85)       2.23 (1.43)
Having friends’ respectᵃ,ᶜ              1.63 (1.02)    1.25 (.85)       2.23 (1.39)
Having fun                              1.81 (1.51)    1.58 (.99)       1.98 (1.46)
Being tiredᵇ                            5.06 (2.62)    6.00 [illegible] 5.58 (1.88)
Supporting yourselfᵃ,ᵇ                  [illegible] (.67)  1.21 (.74)   2.33 (1.80)

Note. 1 = most positive, 9 = most negative; all figures rounded to two decimal
places; standard deviations are in parentheses. [The Hard core, White column is
illegible.]
ᵃ Significant univariate race main effect (p < [illegible] or better).
ᵇ Significant univariate economic class main effect (p < [illegible] or better).
ᶜ Significant univariate interaction (p < .005 or better).


In addition to the above analyses, a 2 × 2
multivariate analysis of variance (Race ×
Economic Class) was performed on the evaluation
of working and being unemployed. Significant
multivariate race, economic class, and
interaction effects were found (race: F =
13.63, p < .0001; economic class: F = 5.32,
p < .006; interaction: F = 4.78, p < .01).
Table 3 presents cell means and corresponding
univariate tests.

The white working class evaluates working
at a regular job more negatively than any
other sample, while the black working class
evaluates it most positively. The blacks in
general evaluate work more positively and
unemployment more negatively than the
whites.


Discussion 


Neither the black nor the hard-core subjects
clearly preferred material or lower level
outcomes; likewise, the white and working-class
subjects did not prefer higher order
outcomes. Rather, the pattern of outcome
preference is complex, with material, social,
and higher order outcomes all rated high by
black and white, working-class and hard-core
subjects.
TABLE 2

CORRELATION OF (EVALUATION × INSTRUMENTALITY OF WORK) FOR 15 OUTCOMES
WITH DIRECTLY MEASURED EVALUATION OF WORK


Economic class E. 
Job outcome Hard core Working class ES 
| hitos 
Black? White! Black? aid 
| —.29* 
Getting good pay 18 | 24 S os | e 
Working with people you like 24 | —.17 | I Be 
Having a good boss AT —.19 | th 
Being promoted me —.02 poe 
Enjoying work you do on the job .19 —.03 33 
Having Tesponsibilities .61**9* 08 -18 | 
Owing money —.02 18 —)05 
Saving money .30* —.13 | —.19 
Buying nice things (car, television) 25 16 M 
Being bored —.002 .28* —.001 
Having family's respect 222 16 | 30* 
Having friends’ respect 22 —.06 | NE 
Having fun 47 12 .02 
Being tired 03 —.05 26 
upporting yourself ,S5f6ss ,52**9« joven 
Evaluation of unemployment 19 7:325 26 
li l 
^n = 48, 
*5 zs 
E IESOM 
M5 <.01, 
D$ c por. 


The correlational data likewise give little
support to the notion of a [illegible] to a definite
need hierarchy or
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Economic class 


Mean Ey. 


Hard core | Working class 


Black White 
Working at a pe 
regular joht.b,e 146 1.85 Len d. 
C75 : z 
Being unemployega Sie on ur M 
: (1.04) | 74 


als; standard 


sitive m EG 
ect (p c opt neg; 


iti f 
ative, 
iie ui oF better) 
Action effect (p “0D y S) 


Ky 
à wor 
Outcomes related to the evaluation of 


nily 
Support for one’s self and faata 
important to all groups. Ther part 
the conclusions of Bloom and imple 
is obvious that a relatively fich t 
© motivator-hygiene pr ip 
to represent differen ass 
veen race and social tiple 
data do seem Mert í 
trawser’s (1972) P t 
rtifled public account ir 
deficiencies reported by pat 
* similar to the outcomes ack 
Valuation of work in a dat? 
ass. However, since the presen vac?! 
à cial class differences as well rg in 
» the current results cannot ard! 
terpreted as Support for a need bier ene 
theory of MOtivation, Rather, such —-— 
* explained by looking more jr i^ 
at the environments of black and white Y nd 
ers, including cultural values, aee 
economic and politica] factors, rather ix 


contradict 
(1967). Tt 
extending the Protestant Ethic into a universal
motivational system.

These results have two separate sets of implications.
The first relates to studies of job
satisfaction. It is clear that standard job
satisfaction instruments will not be valid for
all people in all situations. Thus, just as selection
devices should be validated in each situation,
so should job satisfaction measures. This
is not to say that a given instrument is never
valid for a given population; since these data
refer to a global evaluation of work rather
than specific jobs, that statement could not
be made in any case. Rather, it is argued
that if more information than a global evaluation
of the job is desired, the investigation
must first be sure that the questions asked are
relevant to the employee population.

The second set of implications is more 
broadly relevant. The white working-class 
sample in this study fell consistently below 
the black working class in their evaluation of 
job outcomes and in the evaluation of work 
as directly measured. In addition, no strong 
correlations (and two puzzling negative corre- 
lations) were obtained in that sample. The 
correlations between the evaluation of work 
and unemployment were, however, highest for 
the white working class. 

This may be a reflection of recent social 
and economic trends. The recent high levels of 
unemployment and inflation in the United 
States may very well have influenced the per- 
ceptions of this group in a negative direction. 
Relative to their previous adaptation level, 
work may not provide the security and satis- 
faction it once did. On the other hand, the 
black working class is experiencing a degree of 
upward mobility due to minority hiring pro- 
grams, equal opportunity laws, and the like. 
Thus, relative to their adaptation level, things 
are getting better. 

The white working class also seems to re- 
gard work and unemployment as more nearly 
opposite concepts than do the other samples. 
The black hard core, in fact, seems to regard 
the two as independent, that is, the evaluation
of work has little implication for the
evaluation of unemployment.

This may also be a reflection of experien- 
tial differences. Because blacks are, and have 
been, more often unemployed than whites, the 
evaluation of unemployment for blacks may 
depend on other factors—in each specific in- 
stance, what other opportunities there are to 
make money, and whether welfare payments 
or unemployment compensation is available, 
etc. Blacks may have learned, out of neces- 
sity, ways to alleviate the hardships of unem- 
ployment. Whites, since they have less often 
been unemployed, might not have developed 
the necessary skills. 

Thus, these data strongly suggest that very 
different conceptions of work and unemploy- 
ment exist in different racial and social class 
groups. Future research should seek to further 
delineate such differences and explore their 
theoretical implications. 
wide variety of outcomes. Graen (1969) has
shown that the perception of the instrumental- 
ity of good performance for a number of re- 
wards depends upon the actual contingencies 
of reward in the experimental situations. Tri- 
andis, Feldman, and Harvey (1971) observed 
that black hard-core unemployed see weaker 
relationships between behavior and its conse- 
quences than do whites. It is likely that the 
existence of racial discrimination has the 
effect of weakening the contingencies between 
work and its various outcomes, since effort or 
role occupancy does not itself guarantee re- 
ward for the blacks, as it often does for the 
whites. 

Research conducted under the broad head- 
ing of “locus of control” also lends support to 
such an argument. Individuals high in “inter- 
nal control” are generally taken to believe 
that the environment is manipulable through 
their own efforts, while those high on “external
control” seem to believe their lives are
controlled either by chance or by forces outside
their influence. It has been found that
high externals are less likely to attempt social
action (Gore & Rotter, 1963; Strickland,
1965) and depend more on “luck” in experimental
games (Lefcourt & Ladwig, 1965a).
The fact that blacks, especially the lower 
class, have often been found to be high on 
the external side (Battle & Rotter, 1963; 
Bradford, 1967; Bullough, 1967; Lefton, 
1968) strongly implies that unemployed 
blacks should perceive less connection between 
work and positive and negative outcomes than 
whites. 

The findings of Lefcourt and Ladwig 
(1965b) and Gurin, Gurin, Lao, and Beattie 
(1969), which suggest that the internal-exter- 
nal dimension may well be situation specific 
for persons of different abilities, can be used 
to further refine the predictions. Lefcourt and 
Ladwig have shown that when a person be- 
lieves he possesses task-relevant abilities, he 
performs in an internal manner. Thus, both 
black and white working-class people, who 
have job-relevant skills, should show an inter- 
nal pattern of instrumentality perception. The 
blacks should be somewhat less internal than 
the whites, however, since the factor of racial 
discrimination reduces their control over job 
outcomes. 


In the case of unemployment, somewhat 
different conditions should obtain. It is possi- 
ble that blacks and/or the hard-core unem- 
ployed would have stronger perceptions of 
both the positive and negative instrumentality 
of unemployment for the same set of out- 
comes, since they have more experience of 
unemployment and perhaps possess adaptive 
skills the whites and working class lack. 

It should be emphasized that race and social 
class are not considered to be causal agents; 
rather, they are indicants of differential life 
experiences, educational histories, etc., which 
are thought to be the ultimate source of ob- 
served differences. 


METHOD 
Subjects 


Subjects were 200 black and white employed and 
unemployed males between the ages of 18 and 50. All 
were volunteers, recruited from social service agen- 
cies or business firms in the St. Louis, Missouri, 
metropolitan area. Subjects were paid $8 for partici- 
pation in an interview of 2 to 4 hours in duration 
(see Feldman, 1972), during which time their instru- 
mentality perceptions were assessed. 


Interviewers 


Interviewers were young (aged 20-35) black and 
white males, recruited by an associate of the author 3 
(Director of Psychological Services, NASCO, Inc.) 
and selected after an interview by the author. All 
were either in college or working in the St. Louis 
area. They were paid $10 per completed interview. 

Training in the use of the interview materials was 
conducted by the author; each man received approxi- 
mately 6 hours of familiarization and practice with 
the various scales and questionnaires, as well as 
role playing designed to induce a neutral tone and 
manner in the interviewers and to thus minimize the 
chances of bias or demand characteristic effects. To 
further minimize such effects, interviewers were 
never informed of the specific hypotheses under test. 
Also, interviewers were always of the same race as 
the subject. 


Instrument and Procedure 


As defined by Graen (1969), instrumentality is the 
subject’s perceived correlation between a particular 
state (such as role occupancy) and the receipt of 
particular outcomes. Thus, a subject’s perceived 
instrumentality of work and unemployment was mea- 
sured by a three-choice (+1, 0, —1) scale for each 
of 15 work outcomes. This scale was presented on a


3 The fine work of William Harvey and the staff at
the Narcotics Service Council (NASCO) West in 
locating subjects and coordinating interviews is 
gratefully acknowledged. 
flash card by the interviewer, who read each outcome
to the subject.
Outcomes were obtained in one of two ways. The
first 5 outcomes were the five factor names of the
Job Description Inventory (JDI; Smith, Kendall, &
Hulin, 1969), paraphrased and used as standard
outcomes. These were good pay, working with people
you like, having a good boss, being promoted,
and enjoying the work you do on the job. The next
10 outcomes were elicited from a pretest sample of
black and white employed and unemployed men as
responses to the question “Name three things you
think you get from working.” The 10 most frequent
responses overall were included in the final list. They
are having responsibilities, owing money, saving
money, buying nice things (car, television), being
bored, having respect from family, having respect
from friends, having fun, being tired at the end of
the day, and supporting yourself (and your family).
All items and instructions were "decentered" prior
to administration by black consultants. This procedure
involved translation into typical "ghetto"
language and retranslation by a second person into
standard English until a single version, comprehensible
to all subjects, was obtained. (The three-choice
instrumentality scale was used in preference
to a more differentiated instrument on the advice of
these same consultants.) Also, a practice sheet, requiring
subjects to rate very obvious outcomes of
work and unemployment, was administered before
actual data collection. This reduced warm-up effects
and allowed interviewers to check the subject’s understanding
of the task. Subjects who could not
comprehend the task were excused with a small
payment (see Triandis, 1972, or Triandis & Malpas,
1970, for a full explication of the above procedures).
Interviewers first gave subjects a general introduction
to the purpose of the study. At the appropriate
time in the interview, the instrumentality concept
was then explained and practice items given. Subjects
rated their perceived instrumentalities of
working at a regular, steady job (at least 35 hours
per week) for each of the 15 outcomes and their
perceptions of the same outcomes of not having a
regular job and making a living some other way.
Order of presentation was counterbalanced across
p < .00005). Univariate tests showed that the hard
core reported receiving more public aid, being
employed for less time and being unemployed
longer, being laid off rather than quitting
from jobs, and receiving more money per
month during layoffs when compared to the
working class.
The same analysis performed on demographic
data yielded significant race (F =
6.57, p < .00005), economic class (F = 2.21,
p < .003), and interaction (F = 1.74, p <
.03) effects. Univariate tests showed blacks
tend to have been married longer than whites,
have more children, have lived in St. Louis
longer, have parents who worked less often
and at less skilled jobs, rated themselves as
being poorer as children, and had less frequently
held part-time jobs as children. Hard
core, as compared to working class, had [illegible]
skilled parents but more often held part-time
jobs. Finally, the white working class reported
higher educational attainment than any other
group. (For full details see Feldman, 1972.)
From the above data, it may be concluded
that the sampling strategy was adequate: the
job-history analysis reveals no confounding
between race and social class, while the
differences observed on demographic variables
seem to reflect differences in life history that
typify black and white populations. No unusual
between-group differences exist which
might threaten the internal or external validity
of the study.
tality relationships between working and
15 job outcomes, was not supported. As Table
1 shows, blacks generally reported [illegible]
instrumentality perceptions than whites
[illegible] .00005). Univariate tests reveal blacks [illegible]
of 15 items by univariate test, reveals a
greater difference between the black and 
white working-class than between black and 
white hard-core samples, with the black work- 
ing class’ perceptions always being highest. 
Also, the black hard core’s instrumentality 
perceptions are lower than the black working 
class’, while the opposite is true for the white 
sample. 

The second hypothesis, that blacks and/or 
the hard core would have stronger instrumen- 
tality perceptions for a state of unemploy- 
ment, was not supported. As Table 2 illus- 
trates, the black subjects do not have more 
polarized perceptions than the whites; rather, 
there is a clear difference in directionality 
(multivariate race main effect, F = 26.49, p 


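TABLE 1

MEAN RATINGS OF THE INSTRUMENTALITY OF WORK FOR FIFTEEN OUTCOMES

                                      Hard core                Working class
Outcome                               Black        White       Black        White
 1. Good pay                          [illegible]  .69         .88          .58
 2. Work with people you like         .60          .56         .98          .50
 3. Having a good boss                .58          .50         1.00         .38
 4. Being promoted                    [row partly illegible; legible value: .60]
 5. Enjoying the work you do          [row illegible]
 6. Having responsibilities           [row illegible]
 7. Owing money                       [row partly illegible; legible values: -.13, -.62, -.02]
 8. Saving money                      [row partly illegible; legible values: .69, .58]
 9. Buying nice things
    (car, television)                 .83          .67         [illegible]  .60
10. Being bored                       -.60         -.44        -.46         -.29
11. Having respect from family        .88          .83         .98          [illegible]
12. Having respect from friends       [illegible]  [illegible] .96          .56
13. Having fun                        [illegible]  .67         .83          .54
14. Being tired at end of day         .06          .50         -.17         .56
15. Supporting self and family        [row partly illegible; legible value: .96]

Note. Ratings made on three-choice scale: +1, 0, -1, indicating positive,
negative, or no association of working with a given outcome; all numbers
rounded to second decimal place. The first five outcomes are paraphrased Job
Description Inventory factors; the remaining outcomes were elicited from
subjects similar to those sampled. [Superscript markers on individual outcomes
are illegible.]
a Significant univariate race effect (p < .03 or better).
b Significant univariate economic-class effect (p < .03 or better).
c Significant univariate interaction effect (p < .05 or better).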


TABLE 2

MEAN RATINGS OF THE INSTRUMENTALITY OF NOT WORKING FOR FIFTEEN OUTCOMES

                                      Hard core                Working class
Outcome                               Black        White       Black        White
 1. Good payᵃ,ᵇ,ᶜ                     -.04         -1.00       .56          -.96
 2. Work with people you like         .31          -.96        .81          -.92
 3. Having a good bossᵃ,ᵇ,ᶜ           [illegible]  -.98        [illegible]  -.96
 4. Being promotedᵃ,ᵇ,ᶜ               .00          -1.00       .67          -.98
 5. Enjoying the work you doᵃ,ᵇ,ᶜ     .13          -.98        .77          -.85
 6. Having responsibilities           .52          -.13        .50          .00
 7. Owing money                       -.38         .19         -.58         .38
 8. Saving moneyᵃ,ᵇ,ᶜ                 .25          -.79        .71          -.85
 9. Buying nice things
    (car, television)                 [illegible]  -.83        .62          -.69
10. Being bored                       -.25         .63         -.35         [illegible]
11. Having respect from family        [illegible]  .21         .88          -.40
12. Having respect from friendsᵃ,ᵇ,ᶜ  .67          .33         .81          -.31
13. Having fun                        .50          -.13        .69          -.15
14. Being tired at end of day         -.15         -.19        -.38         -.28
15. Supporting self and family        [row partly illegible; legible values: .69, -.60]

Note. Ratings made on three-choice scale: +1, 0, -1, indicating positive,
negative, or no association of not working with a given outcome; all numbers
rounded to second decimal place. The first five outcomes are paraphrased Job
Description Inventory factors; the remaining outcomes were elicited from
subjects similar to those sampled. [Superscript markers are legible only for
Outcomes 1, 3, 4, 5, 8, and 12.]
ᵃ Significant univariate race effect (p < .0001 or better).
ᵇ Significant univariate economic-class effect (p < .04 or better).
ᶜ Significant univariate interaction effect (p < .02 or better).


< .00001). The black subjects tend to see 
more positive instrumentalities, the whites 
more negative. This is supported by signifi- 
cant univariate race effects for 14 out of the 
15 outcomes. 

The hypothesis is also contradicted by the
significant multivariate economic-class effect 
(F = 2.97, p < .0003); generally, the hard 
core’s ratings were less polarized than the 
working class’. Univariate economic-class ef- 
fects occurred on 9 of the 15 outcomes. 

The significant multivariate interaction ef- 
fect (F = 2.89, p < .0005) once again reflects 
a larger difference between black and white 
working class than occurs between the black 
and white hard core. The white working class 
is especially negative on the five JDI outcomes,
while the blacks are moderately positive.
Again, the black hard core's responses
are lower than the working classes’, while the 
white samples do not appear to differ. Uni- 
variate interaction effects are significant for 9 
of the 15 outcomes, as indicated in Table 2. 


Discussion 


These data, while not supporting the original
hypotheses, strongly suggest that race and
economic class are associated with different
views of the world of work. It should be noted
here that the alternative explanation of response
biases in one population or the other is
not tenable, given the differences in polarization
and directionality evident in Tables 1
and 2. Another alternative hypothesis—differential
understanding of the task—is unlikely,
due to the use of practice sheets, which
allowed the elimination of that small percentage
of the subjects who could not comprehend
the task.
(1973) to explain differences in
evaluation may explain these differences. It
is assumed that because of recent economic
and political events the white working class
has a less secure position than before. This
loss, compared to a previously high adaptation
level, could account for their relatively low
ratings. If at the same time the blacks,
especially the working class, experience a
rise in job availability and an increase in job
security due to legislative action, their perceptions
would increase relative to a previously
low adaptation level. Thus, the work situation
would become more unpredictable for the
whites and less unpredictable for the blacks.

It might also be argued that the whites,
because they have somewhat more
employment and are thus more sophisticated
in the work area, do not believe as much
in the “myth” that steady work brings promotion,
material gain, etc. However, this does
not seem to be as good an explanation as the
adaptation-level hypothesis for two reasons.
First of all, it seems likely that blacks, who
for years have been discriminated against on
the job as well as in the job market, would
believe less in the efficacy of work than their
white counterparts. A black man who has
worked all his life without much advancement
would be an unlikely champion of the Horatio
Alger mythology. Second, the working-class
blacks most strongly believe that working is
associated with positive outcomes. If the “experience”
hypothesis were valid, the hard
core should express the strongest beliefs, since
they have less experience with steady employment.

The differences in the perception of the
instrumentality of unemployment can probably
be attributed to differential experience. As
Feldman (1973) has speculated, the black
samples may have a more realistic conception
of unemployment and a wider range of alternative
means of making a living. Also,
pay, promotion, etc., may be defined differently
in black populations and may be
obtainable through [illegible] jobs
and other means outside of regular employment.
In addition, it seems that respect is accorded
on a different basis in black and white
samples. The whites seem to see respect as
associated with a role—that of worker or
provider. The blacks may well look at respect 
as dependent entirely upon the person, consid- 
ered as something apart from his role. This is 
logical, in terms of the more frequent unem- 
ployment which has characterized black com- 
munities in the past. 

It is, of course, possible that such percep- 
tions represent a defensive self-deception. The 
blacks may perceive, or report perceiving, that 
unemployment "isn't so bad" because this 
protects them against the loss of status that 
comes with unemployment. While such a re- 
sponse is predictable from dissonance theory 
(Festinger, 1957), this explanation is not as 
parsimonious as the previous one, since it does 
not take other data into account. Feldman 
(1973) has shown that the black working class 
evaluates unemployment lower than all the 
other samples, while evaluating work highest. 
The white working class’ responses are almost 
the opposite—only the white hard core rate 
unemployment less negatively. If a defensive 
distortion of reality were occurring, it ought 
to operate in the evaluative as well as the cog- 
nitive area. 

These data contradict the conclusions of 
Triandis et al. (1971) that “. . . the black 
samples in general, and the hard core in par- 
ticular, see fewer connections between what 
one can do and desirable or undesirable out- 
comes . . . [p. 40].” However, the samples 
employed by Triandis et al. were quite differ- 
ent from the present group. Their hard-core 
sample had a history of deviant behavior, 
including drug abuse; their adolescent black 
and white samples had academic and social 
problems resulting in assignment to a special 
vocational program; but the middle-class 
white sample consisted of female college stu- 
dents. Thus, the samples in their exploratory 
study were considerably more extreme on sev- 
eral economic and social dimensions than are 
the present samples. 

It may be concluded, then, that considerable
differences exist between race and economic-class
groups in their perception of the
outcomes of work and unemployment. Future
research should be carried out in two areas.
First of all, the behavioral consequences of
such perceptions should be explored. Second,
theoretically based research aimed at determining
the environmental causes of such cognitive
differences should be undertaken.
Studies concerning the hiring, training, and retaining of the so-called hard-core
unemployed are reviewed. The evidence indicates that the characteristics of the 
hard-core unemployed—such as age, sex, and marital status—and the char- 
acteristics of the supervisory and counseling roles and their occupants are 
related to turnover. Although training does not seem to affect the propensity 
to terminate, it does have both functional and dysfunctional effects on work 
attitudes. Job structure, pay level, and other organizational variables are
related to turnover.


Many organizations are involved in pro- 
grams to hire, train, and retain the so- 
called hard-core unemployed (HCU), and 
recent years have seen an increasing amount 
of research on this problem. The purpose of 
this paper is (a) to provide a conceptual 
framework which serves to organize these research
studies and (b) to evaluate what has
been learned and what directions for future 
research are needed. 

One hundred and ninety-two articles on 
training or employing the HCU (private 
sector only) were examined; 28% (54) of
these related to firms' experiences in HCU
programs. From this group 44% (24) were selected
on the basis of an empirical criterion,
that is, they presented some systematic 
analysis between independent variables (e.g., 


1 Work on this paper was supported by a grant
to the first author from the General Electric Foundation.
The review of the literature and initial write-up
were conducted while the three authors were at the
Graduate School of Business, University of Chicago.

2 Requests for reprints should be sent to Paul
Goodman, Graduate School of Industrial Administration,
Carnegie-Mellon University, Pittsburgh,
Pennsylvania 15213.

3 It is difficult to precisely define the term “HCU” 
as used in these studies because of lack of informa- 
tion. However, a general characterization would be: 
the HCU is a member of a minority group, not a 
regular member of the work force, has less than a 
high school education, is often under 22 and of a 
poverty level specified by the Department of Labor. 
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type of training, individual differences) and 
criterion variables (e.g., turnover). 
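
The screening percentages reported above can be checked with a few lines of arithmetic. The sketch below only recomputes the two shares from the counts given in the text (192, 54, and 24); the variable and function names are illustrative and are not part of the original review.

```python
# Counts taken from the text: articles examined, articles relating to
# firms' experiences, and articles meeting the empirical criterion.
articles_examined = 192
firm_experience = 54
empirical = 24


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


share_firm = percent(firm_experience, articles_examined)  # 54 of 192
share_empirical = percent(empirical, firm_experience)     # 24 of 54
print(share_firm, share_empirical)
```

Both rounded values agree with the 28% and 44% stated in the text.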


CONCEPTUAL FRAMEWORK 


The HCU worker operates in a complex 
social system. The focal organization pro- 
viding the training and job, community or- 
ganizations, government agencies, informal 
peer groups, and the HCU worker’s family 
are all components of this social system that 
bear on the HCU worker's behavior. Within 
each organization there are role relationships 
and other structural properties (e.g., type of 
job available, promotion opportunities, pay 
level) that directly affect the HCU worker’s 
behavior. Recognition of these multiple fac- 
tors seems necessary in order to understand 
the HCU worker. Too often, researchers have 
defined a very limited social system composed 
primarily of the HCU worker, trainer, and 
supervisor (cf. Goodman, 1969). 

A social system implies not only multiple 
variables, but the interdependence of these 
variables. Change in one variable has a com- 
plex effect on the other dimensions. A major 
theme in most HCU studies is that change 
should be focused primarily on the HCU 
worker—that is, how to change him to fit 
(i.e., be retained by) the organization. A
social system model focuses on a broader 
perspective—what changes at the individual, 
organizational, or societal level are necessary 
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to provide employment opportunities for the 
HCU worker. 


An expectancy-performance model may also
be used in viewing the HCU literature.
Basically, this model holds that behavior is
a product of the expectancies about behavior-
reward contingencies and the attractive-
ness of these rewards. High retention rates
would occur, then, when workers believe that
remaining on the job leads to desired rewards,
whereas leaving the job does
studies (cf. Hen 


the relationship between expectancies and 
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program. Greenberg (1968), Gurin (1968), 
Rosen (1969), Shlensky (1970), Lipsky, 
Drotning, and Fottler (1971), Davis, Doyle,
Joseph, Niles, and Perry (1973), and Kirch- 
ner and Lucas (1972) report a similar rela- 
tionship between age and dropouts during a 
training program. In terms of our model,
younger HCU workers probably experience 
greater feelings of distrust toward the focal 
organization (Clark, 1968). Accordingly, they 
would perceive lower expectancies about the
likelihood of receiving rewards and would be 
more likely to leave. Older workers probably 
have higher expectancies and a greater desire 
for the rewards (i.e., regular salary) that are 
contingent on attendance. Only Allerhand,
Friedlander, Malone, Medow, and Rosenberg 
(1969) ionship between age 
n variables. There is 
on the comparability 
th other studies we reviewed 
why the results of Allerhand et
al. differ from the other findings on age.

Sex 

The evidence indicates that female job 
retention is significantly higher than the 
retention of males (Davis et al., 1973;
Greenberg, 1968; Gurin, 1968; Shlensky,
1970). Females are also more 
a job at the c 
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Ompletion of train; 
et al., 1971), 
does not support these relationships,
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bility—is more ambiguous. Gurin (1968), 
Rosen (1969), and Shlensky (1970) did not 
find number of dependents to be a signifi-
cant predictor of HCU behavior. One study
(Hinrichs, 1970) reports that number of de- 
pendents was positively related to retention, 
but since that study did not control other 
individual characteristic variables (e.g., age), 
its conclusions can be accepted only tentatively.

The findings supporting a relationship be-
tween family responsibilities (e.g., marital
status, owning a home) and retention reflect
the greater need for job-related rewards (e.g.,
money); that is, greater responsibilities de-
mand greater resources which can be attained
by job attendance. Following the expectancy 
model, the greater the attraction of rewards 
related to holding a job, the higher the 
retention rates. 
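
The reasoning used throughout this review can be sketched in code. The text states only that behavior is a product of expectancies and the attractiveness of rewards; the version below assumes a multiplicative form summed over rewards, which is a common way of writing expectancy models and not something the studies reviewed here measured. The scales, numbers, and function names are all hypothetical.

```python
def motivational_force(outcomes):
    """Sum expectancy * attractiveness over the rewards tied to an action.

    `outcomes` is a list of (expectancy, attractiveness) pairs. Expectancy
    is treated as a subjective probability in [0, 1] and attractiveness as
    a signed rating; both scales are assumptions made for illustration.
    """
    return sum(e * a for e, a in outcomes)


def predicted_choice(stay_outcomes, leave_outcomes):
    """Predict 'stay' when staying carries at least as much force as leaving."""
    if motivational_force(stay_outcomes) >= motivational_force(leave_outcomes):
        return "stay"
    return "leave"


# Two hypothetical workers facing the same rewards but holding different
# beliefs that job attendance will actually lead to those rewards.
trusting = predicted_choice([(0.9, 5.0)], [(0.5, 3.0)])     # 4.5 versus 1.5
distrustful = predicted_choice([(0.2, 5.0)], [(0.5, 3.0)])  # 1.0 versus 1.5
print(trusting, distrustful)
```

Under these assumptions, lowering the expectancy alone is enough to reverse the prediction, and raising the attractiveness of the job-related reward would work in the opposite direction, which is the argument made above for family responsibilities.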


Birthplace 


The birthplace or the geographical area 
where the trainee spent his formative years 
seems related to turnover of HCU workers. 
Higher retention rates were reported for those 
born in the rural South (Quinn et al., 1970;
Purcell & Cavanagh, 1969) and the West 
Indies or Latin America (Shlensky, 1970) 
as opposed to those from the urban North. 
This relationship seems to parallel findings on 
rural-urban differences (cf. Hulin & Blood, 
1968) which suggest that the value premises 
of rural-born individuals might be more 
congruent with organizational requirements. 


Education 


Evidence on the relationship between edu- 
cation and the criterion variables is mixed. 
Greenberg (1968) and Shlensky (1970) re- 
port significant positive relationships between 
education and job retention; in the latter 
study, the finding holds only for the black 
HCU. Gurin (1968), Quinn et al. (1970), 
Lipsky et al. (1971), and Davis et al. (1973) 
report no relationships for education. Unfor- 
tunately, there is little information in these 
studies on the distribution of education or
the relationships between educational attain-
ment and job requirements to permit a
reconciliation of these findings. 
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Previous Job History 


Present job behavior should reflect, to some 
extent, the patterns of past job behavior and 
earnings. Quinn et al. (1970) report that 
terminations were greater (54%) for those 
with more than two jobs in the last 2 years 
as compared with those (25%) who held fewer
than two jobs in the same time period. Many
of the other studies (cf. Greenberg, 1968; 
Hinrichs, 1970) report similar relationships. 
It seems that the inability to stay on past 
jobs leads to lower expectancies that rewards 
will follow from job attendance and to lower 
expectancies by the individual that he is 
capable of remaining on jobs. Following our 
model, these lower expectancies should lead to 
lower job retention. 


Personality and Description of Self


Researchers interested in explaining HCU 
trainee behavior have examined the role of 
personality. Some studies have used tradi- 
tional measures of personality characteristics, 
while others have employed single-item scales 
to tap specific attitudes and values. In gen- 
eral, the results are not encouraging. Quinn 
et al. (1970) introduced some 21 indexes in 
their study; only two exhibited significant dif- 
ferences in the criterion variables, of which 
one was in the direction opposite from the 
prediction. Frank (1969) used a more exten- 
sive battery of tests and also obtained few 
significant results. Gurin's (1968) analysis of
five scales dealing with orientation toward 
work, personal efficacy, and attitudes about 
the Protestant Ethic also did not reveal any 
strong consistent relationships to the criterion. 

Research by Allerhand et al. (1969), 
Hinrichs (1970), and Teahan (1969) indi- 
cates that there may be some relationship 
between personality factors and the criterion 
variables for HCU trainees. Hinrichs (1970) 
reports that trainees who rated their own 
ability as high were more likely to be con- 
sidered highest in performance during a train- 
ing program. Allerhand et al. (1969) report 
that individuals who indicated a strong need 
to be perceived as smart by their boss and 
who perceived themselves as having a high 
level of energy and activity were less likely 
to drop out of a prejob orientation program. 
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Teahan’s (1969) study focused on the time 
span concept. He indicates that terminators
from an HCU training program possessed
shorter time spans and were less optimistic 
about their future than were those who re- 
mained in training. Data from each of these 
studies seem to indicate that a favorable self- 
image and orientation toward producing posi- 
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Supervisor Role 


A number of studies indicate that the suppor-
tiveness of the supervisor affects HCU behav-
ior. Beatty (1971) reports that consideration
(measured by the Leadership Behavior De-
scription Questionnaire) was positively cor-
related with performance (r = .38). A further
analysis, however, indicated that for those
trainees in the extremes of the distribution of
performance scores, the relationship with
consideration was negative. Another important
finding is that only first-level supervisory be-
havior, and not second-level supervisory
behavior, was related to HCU trainees' per-
formances. Friedlander and Greenberg (1971)
report a similar positive relationship between
supervisory supportiveness and performance.
Another interesting finding in their analysis
is that significant discrepancies existed be-
tween the HCU worker and the supervisor in
their perception of the supportiveness of the
organization; that is, HCU trainees defined
the work gl less supportive. 
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Probably clarifies the expectations about re- 
wards and expected performance. Following 
our model, these conditions seem to lead to 
higher retention and performance. 


Counselor and Trainer Roles 


Unfortunately there are few studies meeting 
our criteria which deal with the effect of the 
counselor-trainee role on HCU trainee behav- 
ior. Quinn et al. (1970) report findings simi- 
lar to their analyses of the first-line super- 
visor—the fairness of treatment by the coun- 
selor during training is positively related to 
job retention. 

Gurin (1968) provides a provocative analy- 
sis of the sources of attractiveness of coun- 
selors and trainers for the HCU trainees. 
Counselors (versus vocational and basic edu- 
cation teachers) were defined as the most 
attractive staff members by the HCU trainees. 
Black counselors, however, were perceived as 
more attractive than white counselors for 
male trainees, while race differences did not 
differentiate the attractiveness of the occu- 
pants of the training roles. This difference in 
preference for black versus white counselors 
may be attributed to the fact that black 
counselors expressed values and beliefs more 
congruent with those of the trainees. How- 
ever, an analysis of trainees’ perceptions 
indicated that they felt black counselors 
stressed middle-class values more than white
counselors did. This finding would seem to 
indicate that HCU trainees were more willing 
to accept middle-class socialization attempts 
from a black than a white counselor. Gurin 
confirms this point by indicating that there 
was a positive association (+.27) between 
stressing middle-class values and the attrac- 
tiveness of the counselor for black male 
counselors but no association (—.04) for 
white male counselors. These findings and 
others reported by Gurin are important be- 
cause they indicate that certain combinations 
of race and sex with specific roles have a 
more powerful effect on the socialization of
the HCU worker. In terms of the model, it 
suggests that these combination effects will 
have a greater impact in changing expectan- 
cies and the attractiveness of rewards and, 
thus, on retention and job performance. 
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Peers in the Work Organization 

Friedlander and Greenberg (1971) report 
that HCU workers’ perceptions of the sup- 
portiveness of their peers and others in the 
organization to new workers was related to 
supervisory ratings of performance. In gen- 
eral, the more supportive the trainee viewed 
his peers and others in the organization, the 
more likely he was to be evaluated by his 
supervisor as competent, congenial, friendly, 
and conscientious, but not necessarily as more 
reliable. Case studies by Campbell (1969) 
and Kirchner and Lucas (1971), as well as 
an experiment by Baron and Bass (1969), 
also point to the importance of peer-group 
relationships. 

Morgan, Blonsky, and Rosen (1970) ex- 
amined the reactions at different levels of the 
existing work force in the firm to a program 
for the HCU. They found a shift from posi- 
tive to neutral feelings at the end of the 
12-week program. Differences in attitudes 
toward the HCU and the program varied in 
terms of the role distance between the trainee 
and the respondent. For example, individuals 
at the vice presidential level showed an in- 
crease in positive attitudes. For foremen and 
the rank-and-file group, there was a tendency 
for positive attitudes to decrease and for 
negative attitudes to increase (p> .01 for 
change in overall attitudes). The modification 
in perceived positive and negative conse- 
quences at different levels probably reflects 
greater realization of problems in dealing 
with HCU workers. The closer one is to the 
day-to-day problems, the more likely it is that 
one's perceptions and attitudes should reflect 
these problems. There are no data in this 
study to indicate the consequences of this 
attitude change on the criterion variables. On 
one hand, the changes might merely reflect 
reality testing—actual experiences and expec- 
tations are more congruent. On the other 
hand, especially at the foreman and rank- 
and-file level, it might lead to less positive 
relationships with the HCU worker. 


Roles Outside the Work Organization 


Some researchers have looked at the social 
context of the HCU’s family and peer group. 
Gurin (1968), for example, found that male 
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- HCU trainees in the lowest earning quarte 
more often came from families (reference is 
to the household of the trainees' mothers)
where a greater percentage of adult males 
were unemployed. Friends of these HCU 
trainees also were more likely to 
ployed. Other findings (cf. Quinn et 
on the characteristi 
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or performance. This conclusion is quite con- 
gruent with our model of HCU behavior. Job 
retention is related to the expectancy that job 
attendance will lead to desired rewards. Al- 
though the training might initially affect these 
expectations, it is the actual work experiences 
which determine the HCU behavior; that is, 
the types and amount of rewards available 
and the frequency of and criteria for their 
allocation determine the expectancies and the 
perceived attractiveness of rewards. These 
factors are quite independent of the training 
experience. 


Counseling 


There are no experimental data on the rela- 
tive effects of different counseling strategies. 
The earlier discussion of the counselor role 
sheds some light on how the demographic 
characteristics of the counselor may influence 
his effectiveness. Several studies (cf. Aller- 
hand et al., 1969; Hearns, 1968; Purcell & 
Webster, 1969; Rutledge & Gass, 1968) indi- 
cate that counseling may contribute to lower 
HCU termination rates. However, it is diffi- 
cult to evaluate the impact of counseling on 
retention, since these studies do not separate 
its effect from other structural dimensions. 

Although there is no evidence supporting 
any significant effects of a particular program 
characteristic (e.g., training), several studies 
(Davis et al., 1973; Janger, 1972; Sedgwicks 
& Bodell, 1972) indicate that the combined 
effects of many program dimensions (e.g.,
counseling, training, providing transportation) 
increase job retention. The problem with this 
conclusion is that we do not know whether 
other uncontrolled variables might explain this 
relationship, nor do we know the nature of the 
interaction effects. Also, Davis et al. (1973) 
provide a contrasting finding for those consid- 
ering formal, elaborate programs—the more 
visible the program, the higher the absenteeism
and turnover.


ORGANIZATION STRUCTURAL CHARACTERISTICS 


Job Structure 

The nature of the job on which the trainee 
is placed affects his work attitudes and pro-
pensity to remain on the job. Quinn et al.
(1970) identified four job characteristics
which seem related to negative job attitudes
and turnover. The inability to change one's 
job assignment now or in the future was re- 
lated to higher termination rates for the HCU 
worker. Assignment to multiple work stations, 
or not having an idea what their work routine 
would be like, was also positively related to 
turnover. Trainees who did not understand 
some aspects of their job, or how it fit into the 
larger picture, were more likely to terminate 
than those who had a better understanding. 
When job activities were perceived as boring, 
turnover was more likely (63%) than when
HCU workers did not find their job boring 
(18%); similarly, involuntary terminations 
were negatively related to skill level (Davis et 
al., 1973). In addition, a number of case stud- 
ies (Bonney, 1971; Campbell, 1969; Good- 
man, 1969b) indicate that job status and job 
mobility are positively related to retention 
rates.


Pay


Another organizational characteristic which 
bears on HCU workers’ behavior is the pay 
system. Although none of the studies we re- 
viewed examined the effects of different pay 
systems, a number of studies did examine the 
effect of pay levels. In Shlensky's (1970) 
analysis, pay was a major predictor among 
groups (e.g., blacks, young people, and
males) that were more likely to terminate and 
thus served to reduce the propensity to termi- 
nate in these groups. Pay did not seem to 
relate to turnover for whites and older work- 
ers. Other studies (cf. Allerhand, 1969; Davis 
et al., 1973; Purcell & Cavanagh, 1969) also
indicate a positive relationship between pay 
and job retention and between pay and com- 
pletion of training (cf. Lipsky et al., 1971). 


Organizational Commitment and Change 


In a few multifirm studies that were re- 
viewed, there is some indication that higher 
commitment (Allerhand et al., 1969; Hearns, 
1968; Janger, 1972), company willingness to 
change policies and procedures (Allerhand et 
al., 1969; Goodman, 1969b; Hearns, 1968;
Janger, 1972), and more realistic company 
expectations of the HCU (Allerhand et al., 
1969) are associated with higher retention 
rates. 
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Employment Stability, Size, Industry 


Companies with lower turnover rates in 
entry-level jobs seemed to have higher reten- 
tion rates with HCU workers than did other 
firms (Allerhand et al., 1969). Medium-sized 
companies (100-500 employees) seemed to 
retain more HCU workers than did larger or 
smaller firms (Allerhand et al., 1969). Using 
multivariate analysis, Lipsky et al. (1971) 
found that white-collar versus blue-collar jobs 
and jobs in manufacturing versus nonmanu- 
facturing industries were two significant pre- 
dictors of training program completions, 
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relationships between individual variables and 
the criteria are reduced, but not eliminated, 
when organizational variables are entered into 
a regression analysis (cf. Greenberg, 1968; 
Shlensky, 1970). 

Although the relationships are complex, 
both between individual and organizational 
variables and among the individual level 
variables, a number of observations can be
drawn from these studies. First, there are
clearly no simple selection rules. Also, select-
ing out HCU workers based on the individual-
difference information would be inappropriate
given the purpose of HCU programs. Second, 
the design of a program should reflect the dif- 
ferences among the HCU work force. If a firm 
must select HCU workers with heterogeneous 
characteristics, it would seem important to 
design the program to reflect differences in
their expectations and preferences for rewards.
That is, a young unmarried male would re-
ceive different program inputs than a married
female with two children.

The HCU trainee operates in a large social
system with many interconnected role rela-
tionships. The degree of conflict between the
HCU trainee and his counselor, supervisor,
and peers clearly can affect his be-
havior. In one study there was some indica-
tion of an interaction effect between the coun-
selor-trainer role and the similarity of the
background characteristi 
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Much of the literature on the HCU worker 
focuses on the effectiveness of the different 
training strategies to reorient this individual 
to the world of work. Unfortunately, in the 
studies reviewed, there is no clear indication 
that training significantly affects the turnover 
or performance of the HCU worker. These 
results seem consistent with our model; that 
is, it is unlikely that training would have a 
major impact on job expectancies and the
availability and attractiveness of rewards. Our 
conclusion is not that training of HCU work- 
ers should be discontinued. On the other hand, 
large investments in intensive training pro- 
grams may not be warranted. Future studies 
that examine the effects of different training 
combinations such as short prejob orientation 
combined with on-the-job training versus ex- 
tended vestibule training will provide more 
definitive answers to this question. 

Dimensions of the organization such as the 
type of job and pay system affect the HCU 
worker's behavior. The HCU workers were
more likely to terminate from jobs that they 
did not understand or that afforded little op- 
portunity for movement, etc. The implication 
of these findings is that the trainee's behavior 
must be understood within the technological 
in which he operates and that job 
represent a useful strategy in
affecting the HCU worker's behavior. The
level of pay also affects the HCU worker's 
behavior. The data seem to indicate that firms 
with relatively lower wage rates for entry- 
level jobs should avoid HCU programs. 
Higher paying firms, on the other hand, are
in a position to hire HCU workers who would
otherwise be most likely to leave; that is,
there is some evidence that higher pay reduces 
to terminate for those most 
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stems bear on the HCU worker's pro- 
pensity to come to work. Changes in these 
larger institutional forces must be considered 
when analyzing HCU behavior and HCU pro- 
grams. 

The overall theme of this review is that 
multiple variables affect the HCU worker as 
he operates in a complex social system. 
Changes in the behavior of the HCU worker 
are related to changes in the role-, organiza- 
tional-, and societal-level variables. In many 
studies on the HCU worker, there has been 
an unfortunate assumption that the worker 
must be changed to fit the organization. Our 
concept of the complex social system suggests 
changes must occur at all the main levels of 
analysis (that is, individual, role, organiza-
tional, and societal).

Two other issues, until now implicit in this
review, need to be specified. First, designing
a program to hire and to retain the HCU
worker is an exercise in decision making. It
requires a judgment about the allocation of
resources to a variety of options (e.g., type of
training, counseling, pay). Basically the
manager is interested in the effect of this
allocation on the retention (or performance) 
of HCU workers in relationship to the costs 
of this decision. What is surprising is that 
given the large investment of resources by 
many firms in programs for the HCU worker, 
there has been little attempt to collect and 
develop data systems that would provide 
guidance in the design or reevaluation of such 
a program. There are studies cited in this re- 
view that may serve as models for developing 
data systems to aid in decision making about
HCU programs. Quinn et al. (1970) demon-
strate how an experimental design may be
used in evaluating an organization's
HCU program. Although their study is
more elaborate than a firm would undertake,
their general design could be utilized to evalu-
ate the contribution of different factors (e.g.,
training) to the retention of the HCU worker.
A different data collection strategy, which in-
cludes a number of firms in a cross-sectional
design, is suggested by the Shlensky (1970) 
study. This type of study is valuable since it 
permits the ariables such as 
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which are more amenable to multifirm investi- 
gations. The critical issue, however, is that 
designing programs for the HCU worker re-
quires data which indicate the relative im-
portance of the multiple variables identified
in this review. 

The second issue concerns the role of the 
industrial psychologist and research on the 
HCU worker. Research in this area provides 
a number of opportunities. One can test the- 
ories about work attitudes, motivation, and 
performance. Empirical results from other 
studies can be cross-validated in this popula- 
tion. Psychologists interested in organizational 
change and action research have a “ready-
made” laboratory. The research opportunity
is also unique, since the data bear on an im-
mediate social issue in our country. What is
interesting in reviewing the studies in this
area is that relatively few psychologists have
become actively involved in research in what
would seem to be a fertile area. The question
is, Given an area with good theoretical re-
search opportunities, one in which managers
need data that could be gathered by psycholo-
gists, and one which concerns a relevant social
problem, why is there not greater utilization
of the skills of industrial psychologists?
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In this paper, the Holland occupational classification is applied to a national 
sample of retrospective work histories (N = 973) in order to test (a) the
predictive efficiency of the classification and (b) the related hypotheses from
Holland's theory of careers. Analyses were performed by organizing and re-
organizing the work histories according to the classification. The classification
appears to order lower level occupational histories in an efficient way, well
beyond chance. Also, all three letters in the Realistic code appear to have
predictive validity. The testing of the hypotheses from the theory of careers
suggests that the theory can be applied to both adult work histories and voca-
tional choices of high school and college students.


The scientific study of work histories has 
attracted few researchers because of the diffi- 
culties inherent in recording, organizing, and 
interpreting the large amounts of data in- 
volved in even a small sample of work his- 
tories. The purpose of the present paper is to 
organize and interpret a man's work history 
by applying a theoretical occupational classifi- 
cation. Because the classification was derived 
from a theory of careers and is an instrument 
of the theory, the application of the classifica- 
tion to work histories makes it possible to test 
both the usefulness of the classification in 
ordering work histories and the validity of se- 
lected hypotheses from the theory. 

In this context, the application of the
classification to work histories has the follow-
ing objectives:
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Social Organization of Schools, supported in part as 
a research and development center by funds from 
the U. S. Office of Education, Department of Health, 
Education, and Welfare. The opinions expressed in 
this publication do not necessarily reflect the posi- 
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? Requests for reprints should be sent to John L. 
Holland, Director, Center for Social Organization of 
Schools, Johns Hopkins University, Baltimore, 
Maryland 21218. 

3 The authors wish to thank James S. Coleman, 
John H. Hollifield, James M. McPartland, and 
James M. Richards, Jr. for their skillful editorial 
help. They are also indebted to Mary Cowan Viern- 
stein, Leslie Schnuelle, and Ruth Narot for their 
assistance in processing these data. 


1. It attempts to test the predictive effi- 
ciency of the classification. For example, does 
the occupational category of a man's first job 
predict the category of his next job? His en- 
tire history of jobs? 

2. It attempts to test whether or not men 
in some occupational categories achieve more 
than men in other categories (Holland, 1966b, 
pp. 47-48). Accordingly, the category of a 
man's first job should forecast his eventual 
vocational success. 


The occupational classification comes from 
a theory of six personality types and six model 
environments (Holland, 1959, 1966b, 1973). 
Each type is assumed to function best in 
its corresponding environment, for example, 
Realistic types in Realistic environments and 
Investigative types in Investigative environ- 
ments. The main occupational categories are 
as follows: 


Realistic (laboring, skilled, and technical occupations)
Investigative (scientific occupations)
Social (educational and social welfare occupations)
Conventional (office and clerical occupations)
Enterprising (sales and managerial occupations)
Artistic (artistic, literary, and musical occupations).

Each of the main categories has from 6 to 
14 subcategories such as Realistic-Investiga- 
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tive-Social, Realistic-Investigative-Enterpris- 
ing, Realistic-Artistic-Investigative, etc. The 
subcategories make it possible to characterize
occupations more precisely and to avoid some
of the difficulties inherent in typologies. The
classification has been empirically defined,
tested, and revised by Holland and his col-
leagues (Holland, 1959, 1966a, 1966b; Hol-
land, Viernstein, Kuo, Karweit, & Blum,
1970; Holland, Whitney, Cole, & Richards,
1969; Viernstein, 1972). The development of
the classification is described by Holland et 
al. (1970) and by Viernstein (1972). 


METHOD 


A national representative sample of retrospective
work histories for men aged from 30 to 39 in 1968
was obtained to develop a social accounting system.
A supplementary sample of black households only
was also obtained in order to create representative
black and nonblack samples. The target population
in this study is the total population of males from
30 to 39 years of age residing in households in the
United States. Individuals in the sample were selected
by standard multistage area probability methods.
The sampling method, the data collection procedures,
and the tape storage techniques have been sum-
marized by Blum, Karweit, and Sørensen (1969).
The present study used the national sample
(N = 973) from the social accounts program rather
than the separate representative samples of blacks
and whites. The national sample was 87.5% white 4
and 12.5% black.

The occupational classification in Holland et al. 
(1970) was used to assign Holland codes (three- 
letter codes) to the census codes (three-digit codes) 
for the jobs in each man’s work history. The follow- 
ing revisions and exceptions were made: (a) military 
service was excluded from consideration and (b) 
truck drivers were classified as Realistic-Conven-
tional-Enterprising rather than Conventional-Real-
istic-Enterprising.


RESULTS 


The following analyses were usually per- 
formed by organizing and reorganizing work 
histories according to Holland’s classification 
and then by testing selected hypotheses from 
his theory of vocational behavior (Holland,
1966b).


Stability of Work Histories 


The purpose of these analyses was to show 
that the classification organizes occupations 


⁴ This sample also includes small percentages of
Indian, Mexican, Chinese, and Japanese men.


into similar or homogeneous groups. If the
classification performs this task well, men in
the same occupational category should resemble
one another in these ways: (a) They
should possess similar personal traits and
talents. (b) They should possess similar work
histories or they should move among the same
or similar occupational categories. In the following
tables it is assumed that the higher
the predictive validity of the classification, the
more likely it is that the classification organizes
work histories according to homogeneous
groups.

To test this assumption, a man's first full-time
job after full-time education and his job
5 and 10 years later were categorized into
Holland's scheme. These simple analyses of
the data for the 5-year interval are shown in
Table 1. Because most jobs fall in the Realistic
category (72.7%), only the Realistic
category is subdivided according to three-letter
codes or subgroups. The five remaining
major categories have insufficient Ns to study
their subgroups in a reliable way.

Table 1 shows that the category of a man's
first job predicts the category of his job 5
years later with marked efficiency. When the
six main categories are considered (the
subcategories of Realistic are treated as one),
77.3% of the sample falls along the diagonal.

Because a standard chi-square test was not
appropriate to use (Table 1 has many cells
with low expected frequency), a mobility
index (Rogoff, 1953) was calculated to obtain
the total expected frequency for the cells on
the diagonal. In this index, the expected
frequency for each diagonal cell is calculated by
multiplying the appropriate row by column
totals and dividing by the total N. This
expected frequency is 352 or 49.0% of
the sample, whereas 77.3% is the observed
percentage.
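
The expected-frequency arithmetic described above can be sketched in Python. This is a minimal illustration under stated assumptions rather than the authors' full procedure: the function name `expected_diagonal` and the small 2 × 2 table are hypothetical, chosen only to show how row and column totals are multiplied and divided by N for each diagonal cell.

```python
def expected_diagonal(table):
    """Return (expected, observed) counts on the main diagonal of a square table.

    The expected count for diagonal cell i is row_total[i] * col_total[i] / N,
    which is what independence of first and later category would predict.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    expected = sum(r * c / n for r, c in zip(row_totals, col_totals))
    observed = sum(table[i][i] for i in range(len(table)))
    return expected, observed


# Hypothetical 2 x 2 table (rows = first job, columns = later job).
toy = [[10, 2],
       [3, 5]]
expected, observed = expected_diagonal(toy)
print(f"expected on diagonal: {expected:.1f}, observed on diagonal: {observed}")
```

Comparing the two figures as percentages of N gives the kind of contrast reported in the text (expected versus observed percentage of hits).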

This finding is statistically significant and
substantial for several reasons: (a) If job
changes were simply uniform, only one-sixth
or 16.7% would be expected in the diagonal.
(b) The finding cannot be attributed to a
large proportion of men failing to change
jobs, thereby producing a high hit rate (only
18.8% remain in the same occupation over
3 years). (c) And finally, the observed per-
TABLE 1
RELATION OF CATEGORY OF FIRST JOB TO CATEGORY OF JOB 5 YEARS LATER
Occupational category 5 years later
First job category | RIS | RIE | RIC | RSI | RSE | RSC | REI | RES | RCI | RCS | RCE | I | A | S | E | C | Other | Total
RIS 8 3 2 1 | 1 | 2 1 2 Y 2 2 8 25 
RIE 4 65 13] 1 15 2 1 1 ois ¢ 3 i x BF 135 
RIC 2 10 23 | 1 7 1 1 |i 4 1 2 6 $ 1 60 
RSI [z i cae an 2 
RSE 5 48 8) 104. 2 2 2} ET 6 236] & i 5 19 13 977 M 
RSC 3 1 1 | | 2 1 7 
——————— 
REI 1 1 1 | 1 2 3 9 
RES 1 2 3 2-1 1 1 | 1 3 1 6 15 
RCI 2 2 g DIU tax. dab d 2 15 
RCS 1 1 1 | 4 1 2 8 
RCE 1 2 7 1 8 1 1 2 8 23 
i = T = | 
I 1 3 1 1 i i 2 7 EET 
A 1 i ! 2 4 3 
El 1 1 2 3 i 3 i 44 
E 2 6 1 1 2 1 à 3 2 M ot 
C ME $ 2 5 1 i 3 * 30 l 55 
Other | 1 1 1 28 3 
Total | 27 130 81 5 152 4 8 7 8 14 48 54 9 51 102 57 157 
` = 


Note. Abbreviations: R = Realistic, I = Investigative, S = Social, E = Enterprising, C = Conventional,
A = Artistic. [...] military service, unemployed, unknown, etc., and is not included in the totals.
The differences in N from table to table occur because of missing information. In addition,
sample losses occur because many men had not held a full-time job for either 5 or 10 years
(Tables 1 and 2) since their last full-time education at the time of interview.


centage exceeds the base rate of any single
category in the classification. For example,
an efficient prediction can be made by predicting
that everyone will be in the Realistic
category. The observed percentage (77.3%)
exceeds this kind of prediction (63.9%).

The application of these analyses to the
data for a 10-year interval produces similar
results. The mobility index yields an expected
percentage of 55.6%, and the observed
percentage of hits is 74.2%, which exceeds the
base-rate prediction (using the Realistic
category) of 66.2%.

To summarize, the results suggest that the
first letter of a man's occupational code has
predictive validity.

To test the predictive validity of the second
and third letters of the classification,
the data were re[...] and 10 years. For 5 [...]
percent of hits equals [...] square analyses for
Table 2 are both [...] p < .001; for a 10-year
interval, χ² = 71.03, df = 9, p < .001).

To test the predictive validity of the third
letter in the classification, Table 3 was prepared
by selecting only the Realistic-Investigative
categories from Table 1, since only
these categories contained both sufficient subjects
and sufficient variation to make a statistical
test worthwhile. Table 3 clearly reveals
that the third letter of the occupational
code has predictive validity. The percent hits
for 5 and 10 years are 43.6% and 39.6%.
The 3 × 3 chi-square tests are statistically
significant (for 5 years, χ² = 70.19, df = 4,
p < .001, and for 10 years, χ² = 52.57, df =
4, p < .001).
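
The 5-year figures can be checked with a short Python sketch. It assumes the 3 × 3 block of counts for first jobs coded RIS, RIE, and RIC as printed in Table 1 (rows) against the same three codes 5 years later (columns), with the "Other" column of Table 3 used only for the percent of hits. The function name `chi_square` is hypothetical, and the two entries that are illegible in Table 3 are read as 8 and 4 from the matching cells of Table 1.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            chi2 += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Counts read from Tables 1 and 3 (5-year interval); see the assumptions above.
ri_5yr = [[8, 3, 2],     # first job RIS -> RIS, RIE, RIC
          [4, 65, 13],   # first job RIE
          [2, 10, 23]]   # first job RIC
other_5yr = [12, 53, 25]  # "Other" column of Table 3

chi2, df = chi_square(ri_5yr)
hits = sum(ri_5yr[i][i] for i in range(3))
total = sum(map(sum, ri_5yr)) + sum(other_5yr)
pct_hits = 100 * hits / total
print(f"chi-square = {chi2:.2f}, df = {df}, percent hits = {pct_hits:.1f}%")
```

Under these readings the sketch agrees with the values reported in the text for the 5-year interval. The 10-year block of Table 3 is too damaged to repeat the check.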

Taken together, the results imply that lim- 
ited portions of a man's work history are 
orderly or predictable, that the classification 
is useful for showing the orderliness of work 
histories, and that all three letters of the 
occupational codes possess predictive validity. 

The next step was to apply the classification
to every job in each man's work history
for his entire life, from first full-time job to
last job. Table 4 shows that the occupational
classification orders all job transitions or
TABLE 2
PREDICTIVE VALUE OF SECOND LETTER IN CLASSIFICATION CODE
FOR REALISTIC OCCUPATIONS ONLY
Occupational category
Classification code | RI | RS | RE | RC | Other | Total
5 years later
RI | 130 | 26 | 5 | 16 | 43 | 220
RS | 67 | 110 | 4 | 35 | 44 | 260
RE | 5 | 4 | 2 | 3 | 10 | 24
RC | 11 | 11 | 2 | 15 | 7 | 46
10 years later 
RI 119 | 16 | 53 | 217 
RS 104 43 | 48 | 299 
RE | 9 | 2 | 1l | 57 
RC | 17 H 10 | SO 
Note. Abbreviations: R = Realistic, I = Investigative, S = Social, E = Enterprising, C = Conventional.


changes in the same way that the classification
orders the single job transitions in Tables
1-3. In Table 4, 4,566 or 78.6% of the 5,812
transitions for 757 men are among the same
major categories (Realistic, Investigative,
Artistic, Social, Enterprising, or Conventional).
In contrast, the total expected frequency
for the diagonal cells is 54.5 (E =
r × c/n for each cell). This occurrence implies
that work histories are typified by job movement
within the same category.

In theoretical terms, a man's initial occupational
code is a useful index of his personal
dispositions and talents. If a man's code has
validity, then it should forecast his occupational
movements. If the classification lacks
the ability to group men and occupations
according to their psychological similarity, the
application of the classification would produce
only random occupational patterns in such
tables. In lay terms, the classification captures
what everyone knows: "people tend to [...]
doing the same kinds of things." The [...]
task is to develop better schemes for explaining
these regularities in occupational behavior.

Using only men in the Realistic category,
the data in Table 1 were reorganized to learn
if consistent Realistic codes (Realistic-Investigative
and Realistic-Conventional) were
more stable than inconsistent codes (Realistic-Social
and Realistic-Enterprising). Men
with consistent codes are assumed to combine
vocational interests, values, and competencies
that are psychologically consistent or consonant.
For instance, Realistic and Investigative
are considered consistent, because both types


TABLE 3

Occupational category
Classification code | RIS | RIE | RIC | Other | Total

5 years later

RIS | 8 | 3 | 2 | 12 | 25
RIE | 4 | 65 | 13 | 53 | 135
RIC | 2 | 10 | 23 | 25 | 60
10 years later 
i 4 4 12 | 2% 
z 38 11 61 | 
RIC 1 9 22 Se 
Note. Abbreviations: R = Realistic, I = Investigative, S = Social, E = Enterprising, C = Conventional.


38 HOLLAND, SØRENSEN, CLARK, NAFZIGER, BLUM


TABLE 4 


APPLICATION OF THE CLASSIFICATION TO ALL JOB TRANSITIONS


Classification code | RIS | RIE | RIC | RSI | RSE | RSC | REI | RES | RCI | RCS | RCE | I | A | S | E | C | Other | Total
5 7 3 3 2 5 n 3 20 26 6 33 264 
E 2 dio bi 6 167 5 22 8 17 9 50 42 13 15 51 18 194 966 
RIC 3 66 4 5 8 3 6 4 3 19 2 W 26 15 90 469 
i 2 1 JEN: 27 
SE a» out 12 (74 01029 1 20 17 26 (27 w 36 6 dà sg) 54 300 1850 

a ^i 6 1 1 1 2 % 34 4 
RSC 2 @ X g m T gw 1 2 1 3 3 i 0 ££ i 7 
EE 2 11 6 15 1 16 2 1 1 1 1 15 9 19 81 
RGI 4 16 35$ 1 jj 1 1 1 48 2 > 6 » 4d é ) fe i3 
Re 3 5 6 3 2 1 18 4 S E 2 S ü 6l 
RCE 7 30 29 85 5 4 4 1o s 4 3 15 11 60 4385 
MEME 9 1 4 2 2 H3 2 140 21 4 4 — 235 
A 1 2 1 3 2 2 3! 3 0$ 2 55 
i$: 33 9 2 32 2 1 2 i 504 030 43 23 12) 09 ë 3» 
E 7 3 9] 6 yo 1 2 8 6 1 17 3 22 381 24 — 64 606 
Cc 3 27 15 1 32 5 4 4 8 7 5 13 56 128 60 308 
Other 43 221 129 5 243. 11 23 20 no 16 58 38 6 23 80 69 241 997 
Total| 248 1003 466 35 159 28 86 69 123 71 384 281 66 334 728 291 5812 


Note. Abbreviations: R = Realistic, I = Investigative, S = Social, C = Conventional, E =


have an interest in things rather than people
and both lack interpersonal competency. In
contrast, Realistic and Social are considered
inconsistent, because they represent divergent
interests and competencies: things versus
people, and mechanical versus social
competencies. Table 5 summarizes the results of
these comparisons for 5- and 10-year intervals.
Both the percentages for the 5-year
interval (54.5% versus 39.4%) and the
percentages for the 10-year interval (49.8%
versus 29.8%) are statistically significant
(p < .001). The data suggest that men with
consistent codes are more likely to have stable
work histories in accordance with the theory
(Holland, 1966b, pp. 43-44).


TABLE 5
CONSISTENCY OF CODE OF FIRST FULL-TIME
JOB AND OCCUPATIONAL STABILITY

Code of first full-time job | % remaining in same category, 5 years later | % remaining in same category, 10 years later
Consistent codes (RIs and RCs) | 54.5 | 49.8
Inconsistent codes (RSs and REs) | 39.4 | 29.8


Enterprising, A = Artistic. 


Several additional analyses were performed 
to clarify the results in Table 5. Because the 
positive results may only reflect differences in 
expected frequencies of stability for consistent 
versus inconsistent men, the expected and 
observed frequencies for the subsamples of 
consistent and inconsistent men were calcu- 
lated and tested for significance. For men with 
consistent codes over a 5-year interval, the 
observed percentage is significantly greater 
than the expected percentage (p < .05),
whereas the difference between observed and 
expected percentages for men with inconsis- 
tent codes is not significant. The same analy- 
ses for the 10-year interval reveal no signifi- 
cant differences between the expected and 
observed frequencies for either the consistent 
or the inconsistent samples. Therefore, the 
results in Table 5 cannot be attributed to 
differential rates of stability due to unusual 
sampling. 

One final analysis was performed that appears
to strengthen the hypothesis that men
with consistent codes are more stable. Men
with consistent codes tend to maintain a consistent
code, for 64.7% have consistent codes
(Realistic-Investigative or Realistic-Conventional)
at the end of 5 years and 62.7% have
consistent codes at the end of 10 years. In
contrast, men with inconsistent codes
(Realistic-Social or Realistic-Enterprising) are not
only more unstable after 5 years (42.3%),
but this instability accelerates. Only 33.4%
of the inconsistent men had Realistic-Enterprising
or Realistic-Social codes at the end
of 10 years. This final analysis implies that
the inconsistency of a man's code may have
a snowballing effect.


Occupational Achievement 


Several analyses were made to test the 
hypotheses about level of occupational pres- 
tige, income, and education. These hypothe- 
ses were derived from the second statement 
of the theory (Holland, 1966b): 


The level of vocational aspiration is related to the 
personality types. Enterprising, Social, and Artistic 
types . . . have higher aspirations; Conventional, 
Intellectual, and Realistic types tend to underrate 
themselves . . , high educational aspirations will be 
positively associated with the model types in the 
following order: Intellectual, Social, Artistic, Conven- 
tional, Enterprising, and Realistic [pp. 47-48]. 


These hypotheses were applied to the present
data by assuming that occupational prestige
and income were equivalent to the National
Opinion Research Center prestige scale
and by assuming that educational achievement
was equivalent to educational aspiration.
In Table 6, row 1 indicates that the correlation
(rho) between the observed and expected
rank of mean prestige, derived from the code
of a man's job 5 and 10 years earlier, equals
.52 and .61, respectively. For this analysis,
the following a priori code order, ranging
from high to low expected prestige, was derived
from the hypotheses about vocational
aspiration cited above (using the abbreviations
R = Realistic, I = Investigative, S =
Social, C = Conventional, E = Enterprising,
and A = Artistic): ES, EA, EC, SE, SA, SI,
SR, AS, AI, IS, IR, CE, CS, CI, CR, RE,
RS, RA, RI, and RC. Line 2 shows the same
analysis for all job transitions and prestige
(rho = .64). In this instance, the average
prestige levels for all occupational codes
(transitions) have been correlated with the
expected level of prestige. Line 3 shows the
identical analysis except that average income
is predicted (rho = .50). Line 4 shows that
the correlation between the average educational
level and the expected ordering of

TABLE 6 


PREDICTING OCCUPATIONAL PRESTIGE, INCOME, AND 
EDUCATIONAL LEVEL FROM THE HOLLAND 


JOB


Holland Occupational Code | rho

1. Prestige
   Job 5 years later | .52
   Job 10 years later | .61
2. Prestige (all transitions) | .64
3. Income (all transitions) | .50
4. Education (all transitions) | .64


occupational codes (IS, IR, SI, SA, SC, SE, 
SR, AI, AS, AE, CI, CS, CE, CR, ES, EA, 
EC, RI, RS, RA, RC, and RE) derived from 
the hypothesis about educational aspiration 
is .64. 
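
A rank correlation of this kind can be sketched in Python as follows. The sketch assumes that rho denotes Spearman's rank correlation between the a priori order of codes and the observed order of their mean prestige, and that there are no tied means. The five codes are a subset of the a priori list above, but the mean prestige values are hypothetical and serve only to show the computation; `ranks` and `spearman_rho` are illustrative names.

```python
def ranks(values):
    """Rank values from 1 (largest) downward; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out


def spearman_rho(expected_rank, observed_values):
    """Spearman's rho via 1 - 6 * sum(d^2) / (n * (n^2 - 1)), valid without ties."""
    n = len(expected_rank)
    observed_rank = ranks(observed_values)
    d2 = sum((e - o) ** 2 for e, o in zip(expected_rank, observed_rank))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


codes = ["ES", "SE", "IS", "CE", "RI"]          # highest expected prestige first
expected_rank = [1, 2, 3, 4, 5]                  # a priori order
mean_prestige = [60.0, 52.0, 66.0, 41.0, 33.0]   # hypothetical observed means

rho = spearman_rho(expected_rank, mean_prestige)
print(f"rho = {rho:.2f}")
```

Because each code contributes a single mean, a correlation computed this way reflects agreement between orderings of group averages rather than agreement at the level of individual men.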

These results strongly suggest that the 
prestige of a man's job, his income, and his 
level of education can be predicted from the 
Holland codes of his first full-time job or 
from the transitions in his work history. At 
the same time, the use of rho based on means 
inflates the size of the correlations so that the 
true correlations are somewhat lower than 
those in Table 6.

Other correlational analyses (product-moment
correlations) reveal that the correlation
between the consistency of occupational
codes (Holland, 1966b, p. 44) for all jobs and
average prestige level is .50. When a three-level
estimate of consistency derived from
more recent work (Holland et al., 1969) is
applied to the data, the correlation increases
only slightly (r = .54).


DISCUSSION


One main limitation should be remembered.
The size of the sample is too small to make
diagnostic evaluations of subcategories within
the major categories. Only large scale studies
will clearly test the efficacy of all categories
in the classification.

The present study is only the second time
the Holland classification has been tested with
a representative national sample of adult
males. The results imply that the classification
orders lower level occupational histories
(most jobs in a representative sample are in
the Realistic category containing skilled and
unskilled jobs) in an efficient way, well
beyond chance. In addition, all three letters
in the Realistic codes appear to have predictive
validity. This finding is especially important
since Realistic occupations make up the
majority of occupations. In principle, the
subcategories can be extended from three to six
letters, but only three are currently used. In
well-educated populations, the Realistic category
is very small and the remaining categories
are relatively large. The flexibility of
the classification makes it possible to cope
with both representative and unrepresentative
populations. Equally important, the hypothetical
predictions about vocational achievement
suggest that the theory can be applied
to work histories as well as vocational choices
of high school and college students.

The positive results of the present study
are consistent with other studies which have
used the classification. Parsons (1971) has
recently shown that the application of the
classification to the work histories of a
representative national sample of older men (N =
5,000), aged 45-59, also produces moderately
efficient predictions. Holland and Whitney
(1968) applied the classification to longitudinal
data and obtained unusually efficient
predictions of the vocational aspirations of
college students over an 8- to 12-month
interval. For example, 79
1,359) and 93% 
reported successi 


The practical implications seem [...] Users
of the classification have additional evidence
that the classification is probably applicable
and valid for a broad range of [...]. At
the same time, more evidence is needed [...]
individual occupations so that their classification
can become more precise.

The classification is also a tool for studying
occupational mobility according to the kinds
of interests, values, and special competencies
that different kinds of jobs require. The use 
of the classification in conjunction with pres- 
tige or occupational level measures may be 
helpful in clarifying some problems of mobil- 
ity. For example, mobility studies should be 
concerned with a person’s opportunity to 
maximize his talents and interests, as well as 
his opportunity to reach higher levels of in- 
come. From this perspective, poor people suf- 
fer from both the effects of low income and 
the effects of a narrow range of job opportuni- 
ties. If job opportunities are limited to only a 
few kinds, then only a few kinds of talent can 
find expression, and the range of possible 
gratifications from work will also be limited. 
Finally, the classification can be used in 
vocational education, vocational guidance, 
personnel work, and in research as a tool to 
organize occupational data, develop curricular 
clusters, organize career libraries, interpret 
work histories, and facilitate the guidance, 
selection, or placement of students and em- 
ployees. Because the classification is based 
upon a theory which has some positive em- 
pirical support, any occupational data which 
can be reorganized by the classification can be 
interpreted with the aid of the theory.
CAREER CHOICES OF MARRIED WOMEN:
EFFECTS ON CONFLICT, ROLE BEHAVIOR, AND SATISFACTION¹


DOUGLAS T. HALL²


York University


FRANCINE E. GORDON³


Stanford University 


Conflicts, pressures, and satisfactions associated with three career options 
available to married women were studied. The options are full-time employ-
ment, part-time employment, and being a full-time housewife. The main 
hypothesis, that satisfaction would be related to the extent to which women 
actually did what they ideally prefer to do, was supported in the case of 
housekeeping and volunteer activities but not for full-time or part-time em- 
ployment. Role involvements and conflicts were generally greater for workers 
than housewives, although full-time workers differed greatly from part-timers 
and were the most satisfied of the three groups. 


A key issue in the concern with equal op- 
portunities for women is the range of choices 
of possible careers and life styles available to 
women. This paper considers the outcomes of 
conflict, coping, and satisfaction resulting from 
three different career choices made by mar- 
ried women: full-time homemaking (with per- 
haps some volunteer or community responsi- 
bilities), part-time employment, and full-time 
employment.

The main hypothesis predicts that those 
women who are performing activities they 
choose to perform will be more satisfied than
women whose roles do not match their pref- 
erences. This hypothesis is consistent with 
studies showing that a woman’s role perfor- 
mance and attitudes are less positive if she 
works out of economic necessity rather than
dence of role conflict and related coping was
expected among employed women than among
the full-time housewives, since care of the
home and family is generally the responsibility
of the wife in our culture, whether she is
employed or not.


METHOD 


The initial sample was a group of women on the
mailing lists of several women's organizations and
college alumnae clubs in the New Haven, Connecticut,
area. Most of these women were college educated,
a factor related to greater adjustment and
satisfaction in working (Nye, 1963; Orden & Bradburn,
1969; Sobol, 1963). Questionnaires were sent
to 250 women who had previously been invited to
a 1-day symposium on women's roles in which one
author participated. A total of 109 usable questionnaires
were received out of the approximately 250
mailed out.

The questionnaire covered the following issues to 
be examined here: marital status, present work ac-
tivities, preferred work activities, present roles, role 
conflicts, and satisfaction. Present work activities 
were measured with the following: 


Please check below the category (or categories) 
which describe your work activities. 

___ Full-time housewife
___ Full-time volunteer work
___ Part-time volunteer work
___ Full-time employment
___ Part-time employment
___ Other (Please specify)


Preferred activities were measured as follows:

___ Full-time housewife
___ Full-time volunteer work
___ Part-time volunteer work
___ Full-time employment
___ Part-time employment
___ Other (Please specify)


Roles were measured by asking the person to list the
roles "which seem most salient or prominent to you."
Then the person was asked to "list any conflicts or
strains you experience or have experienced" between
roles. Satisfaction was measured as follows:


Overall, how satisfied do you feel with your career?
a. Dissatisfied
b. Neutral: neither dissatisfied nor satisfied
c. Mildly satisfied
d. Very satisfied
e. Extremely satisfied.


A second sample was also drawn in order to pro- 
vide a more clearly identifiable population than the 
first and to replicate and extend the now-versus- 
preferred activity analysis of the first sample. Since 
the first sample was found to have very few full- 
time workers, a more representative sample was 
needed.

The second sample consisted of women from the
University of Connecticut's graduating classes of
1948, 1953, 1958, 1963, and 1968. Questionnaires
were mailed to 450 women, 90 from each class. With
one follow-up, usable responses were received from
261. The response rate was roughly the same for each
class. For nonrespondents, data were not available on
marital status, work status, or other background
variables against which we could test for sample
bias. Of the 261 respondents, 229 were married or
were widowed or divorced mothers. The remaining
women (single women with no children) were not
included in the present analysis.⁴

For the present analysis, the questions used in the
second questionnaire were similar to those in the
first, with one exception, happiness, which was
measured following Gurin, Veroff, and Feld (1960):


In general, how happy would you say you are? 


(Circle one) 


Very Happy | Happy | Not Very Happy | Unhappy | Very Unhappy


The correlation between satisfaction and happiness
was .73.


RESULTS 
Conflicts Reported 


Conflict is viewed here as resulting from
two or more competing pressures (after Hall
& Lawler, 1971). The conflicts experienced
by the women in the sample were coded in
terms of the sources of pressure which produced
them.


⁴ Data reported elsewhere (Gordon, 1971) indicate
that single women are less satisfied and happy than
married women who work full time. Single women do
not differ significantly from married women who are
part-time employees or full-time housewives.


The following sources of pressure were 
identified from the questionnaire responses: 
home (e.g., wife, mother, and housekeeper 
roles), nonhome (e.g., employment, volun- 
teer work), and self (e.g., personal desire for 
free time to develop interests, take courses). 
Another factor, time, did not involve any 
particular role, but it was mentioned so fre- 
quently with no further qualification that it 
was also coded. On a sample of 20 question- 
naires, intercoder reliability was .74. The 
equation used to compute reliability was: 

$$r = \frac{2 \times (\text{number of agreements})}{(\text{number of units coded by coder 1}) + (\text{number of units coded by coder 2})}$$


Also, each questionnaire was coded in terms
of the presence or nonpresence of conflict.
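
The reliability equation given above is simple enough to express as a small Python function. This is an illustrative sketch only: the function name `intercoder_reliability` and the counts in the example are hypothetical, since the article reports the resulting coefficient (.74) but not the underlying numbers of agreements or coded units.

```python
def intercoder_reliability(agreements, units_coder1, units_coder2):
    """Twice the number of agreements divided by the total units coded by both coders."""
    return 2 * agreements / (units_coder1 + units_coder2)


# Hypothetical counts, chosen only to illustrate the arithmetic.
r = intercoder_reliability(agreements=37, units_coder1=48, units_coder2=52)
print(f"r = {r:.2f}")
```

The coefficient equals 1 only when both coders code the same number of units and agree on every one of them.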


Role Activities and Satisfaction 


It was predicted that women performing
activities they prefer to do will be more satisfied
than women who are either (a) performing
activities that they would not prefer to
do, given the choice, or (b) not performing
activities that they would prefer to do, given
the choice. Mean satisfaction scores from both
samples for women in these various conditions
are shown in Table 1.

The hypothesis received support for some,
but not all, types of role activities. The results
were similar in both samples. Full-time
housewives who indicated they preferred this
role were significantly more satisfied than
those who indicated they would prefer not to
be full-time housewives (p < .01, both samples).
A similar significant difference also existed
in both samples between part-time volunteers
who did, and those who did not
express a preference for this role (p < .01).
Furthermore, in the initial sample, those who
preferred and were doing part-time volunteer
work were significantly more satisfied than
those who would prefer such work but were
not currently engaged in it (p < .01). In fact,
the women preferring and doing part-time
volunteer work seemed to be the most satisfied
group in both samples.

In both samples the hypothesis was not
supported in the case of part- or full-time em-
TABLE 1

SATISFACTION RELATIVE TO WORK ACTIVITIES: PRESENT AND PREFERRED

Sample | Now only: No. responses | Now only: Mean satisfaction | Prefer only: No. responses | Prefer only: Mean satisfaction | Now and prefer: No. responses | Now and prefer: Mean satisfaction
Initial | 
Full-time housewife* 25 3.28 0 - | 87 | 3.86 
Part-time volunteer? 24 3.20 9 2.77 38 | 4.00 
Full-time employment 9 | 3.55 5 2.60 | 2 | ' 
Part-time employment 9 | 3.00 37 3.35 24 | 3.54 
Connecticut | | 
Full-time housewife* 68 | 3.45 5 3.60 53 4.03 
Part-time volunteer" 18 | 3.44 22 3.03 19 4.05 
Full-time worker? 28 | 4.00 6 3.16 H | 3.93 
Part-time worker 5 3.20 84 3.63 36 3.63 


a Now and prefer > now only, p < .01, one-
b Now and prefer > now only, p < .025,
c Now and prefer > prefer only, p <
d Now only > prefer only, p < .01, t


ployment. For women who preferred working,
those who were presently working part time
were not significantly more satisfied than
those who were not working. Also, women
who were working and preferring to work
were not significantly more satisfied than
those who were working but would prefer not
to. Furthermore, for part-time work activity
in both samples, women in the "now only"
category were less satisfied than those in the
"prefer only" and "now and prefer" categories;
in the case of full-time employment,
however, women in the "now only" category
were more satisfied than those in the other
two categories.

In each sample, women preferring and
doing part-time work [...] satisfaction than [...]

$$\frac{\text{No. in ``now only''}}{\text{No. in ``now only''} + \text{No. in ``now and prefer''}}$$


Summing across both samples, the percentages
who would prefer a different role are:

Full-time housewife, 93/183 = 51%
Part-time volunteer, 42/99 = 42%
Full-time employment, 37/83 = 45%
Part-time employment, 14/74 = 19%.


Despite their low satisfaction scores, the per- 
centage of women who would change roles is 
far lower for part-time workers than for any 
other group.⁵
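
The four percentages can be recomputed with a few lines of Python. The sketch takes the "now only" counts and the denominators quoted in the text, and derives each "now and prefer" count by subtraction. The denominator of 99 for part-time volunteers is an assumption based on the Table 1 counts (24 + 18 + 38 + 19), because the printed fraction is truncated. The names `pct_prefer_different` and `roles` are illustrative.

```python
def pct_prefer_different(now_only, now_and_prefer):
    """Percentage of women currently in a role who did not list it as preferred."""
    return 100 * now_only / (now_only + now_and_prefer)


# (now only, now and prefer), summed across both samples.
roles = {
    "Full-time housewife": (93, 183 - 93),
    "Part-time volunteer": (42, 99 - 42),    # denominator assumed, see above
    "Full-time employment": (37, 83 - 37),
    "Part-time employment": (14, 74 - 14),
}

for role, (now_only, now_and_prefer) in roles.items():
    pct = pct_prefer_different(now_only, now_and_prefer)
    print(f"{role}: {pct:.0f}%")
```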

Obviously the role dynamics of working 
women are different from those of nonwork- 
ing women. Further, there seem to be impor- 
tant differences between women who work 
full-time and those who work part-time. The 
remaining data were collected in the Con- 
necticut sample to help understand more 
about these differences. 


Profiles of the Three Career Groups 


Respondents in the Connecticut sample 
were divided into three groups: full-time em- 
ployees, part-time employees, and full-time 
housewives. Descriptive data for each of the
three groups are reported in Table 2. Scores
for college class were coded as follows: 


⁵ The authors are grateful to Donald D. Bowen
for pointing out these data.
TABLE 2

PROFILES OF THE THREE CAREER GROUPS

Item | Part-time workers (PTW) (n = 42) | Full-time workers (FTW) (n = 73) | Housewives (H) (n = 114) | t test: PTW vs. H | t test: PTW vs. FTW | t test: FTW vs. H
Classe | 2.19 3.23 2.72 | —26** | —4 opre« 247** 
(1.04) (147) | (1.30) | 
Number of roles | 3.21 | 2.95 | 2.76 2.30* 
(1.09) | (1.29) (1.07) 
Proportion reporting conflict | 84 -768 .729 
| (446) | 
Proportion reporting /inie conflicts | | O87 2.69 
| | C284) | | 
Proportion reporting home conflict | | E | -640 2.15* 
| | (499) (482) | 
Proportion reporting xonhome contlicts | ERI | .236 2,579 2.80*** 
| (499) | (427) 
Proportion reporting self contlict | 478 | 25 | 
| (.385) (432) 
Satisfaction 3. | 3.95 | 3.66 | —2.07* 2.03* 
| (.999) (.862) (.998) | 
Happiness 4.09 430 | 422 
| C655) (684) | (:746) 
i I | 
© standard deviations, £ ire two-tailed 


1948 = 1, 1953 = 2, 1958 = 3, 1963 = 4, and
1968 = 5.

Full-time workers tended to be significantly
younger than either part-time workers or
housewives as determined by college class,
and part-time workers were the oldest. The
relative youth of the full-time workers is in
contrast with the common belief that women
generally wait until their children are grown
before they enter the labor force full time.

Generally, the two groups of working
women experienced more conflict than the
housewives.⁶ Both working groups reported
significantly more conflicts from nonhome
pressures than housewives. Part-time workers
reported more home-related conflicts than the


⁶ Naturally the type or quality of the woman's
job would moderate these and other correlates of
employment, just as the quality of her family
relationships moderates the effects of being a full-time
housewife. However, the present focus is on the
degree of employment (full time, part time, or
nonemployed, i.e., housewife) rather than the nature
of the work itself.




other groups (significantly more than full-time 
employees). Part-timers also reported the 
greatest number of salient roles (significantly 
more than housewives). The full-time workers 
experienced more time conflict than the other 
two career groups (significantly more than 
housewives). Des[...] 
full-time employees reported significantly 
greater satisfaction than part-time workers 
or housewives. The housewives reported the 
fewest salient roles, and they are distinguished 
[...]es of time and nonhome [...] 
incidences of [...] 


Sources of Conflict 


The relationships among the sources of 
conflict were examined next within each career 
group. The correlations are presented in Ta- 
ble 3. 

For all three groups, home pressures were 
the most important contributor to women's 
role conflicts, with nonhome sources next in 


TABLE 3

RELATIONSHIPS AMONG DIFFERENT TYPES OF PRESSURE 
FOR DIFFERENT WORK STATUSES

Pressure   | Presence of conflict | Time        | Home        | Nonhome

Full-time housewives
Time       | [illegible]          |             |             |
Home       | [illegible]          | [illegible] |             |
Nonhome    | [illegible]          | [illegible] | [illegible] |
Self       | [illegible]          | [illegible] | [illegible] | [illegible]

Part-time workers
Time       | .20                  |             |             |
Home       | [illegible]*         | -.20        |             |
Nonhome    | .40*                 | -.02        | .30**       |
Self       | [illegible]          | .10         | [illegible] | -.19

Full-time workers
Time       | .30**                |             |             |
Home       | .62**                | [illegible] |             |
Nonhome    | [illegible]          | -.16        | .61**       |
Self       | [illegible]          | [illegible] | [illegible] | -.05

Note. n = 114, 42, and 73 for full-time housewives, 
part-time workers, and full-time workers, respectively. 
* p < .05, two-tailed. 
** p < .01, two-tailed. 


importance. The main differences between 
groups lay in the other pressures that also 
related to conflict. Full-time workers showed 
the greatest range of pressures, with time, 
home, nonhome, and self pressures contributing 
to experienced conflict. Part-time workers 
were again at the opposite extreme from full- 
time employees, with only home and nonhome 
pressures related significantly to conflict. For 
housewives, home, nonhome, and self pres- 
sures led to conflict. Furthermore, the conflicts 
of the housewives involved self-pressures to 
a greater extent than did either of the other 
groups. Time pressures tended to be the least 
frequent contributor to overall conflict. 


DOUGLAS T. HALL AND FRANCINE E. GORDON 


TABLE 4 


SATISFACTION AND HAPPINESS RELATIVE TO CONFLICT 


Sources of Satisfaction and Happiness 


Table 4 shows correlations between satis- 
faction and happiness and the conflict and 
role variables. As one might expect, the inci- 
dence of conflict and pressure tended to relate 
negatively to happiness and satisfaction. For 
part-time workers, however, there were no 
significant relationships between conflict and 
happiness. The only clue about causes of 
positive outcomes for this group is a positive 
correlation (r = .28) between satisfaction and 
the number of roles the person has. For full- 
time workers, this relationship was negative 
(r = -.21). The difference between these two 
correlations is significant (p < .02). 
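
The text does not say which test was used to compare the two correlations. A common choice for correlations from independent groups is Fisher's r-to-z transformation, sketched below as an illustration only. The group sizes (42 part-time and 73 full-time workers) are taken from the Table 3 note and are assumed to apply to these correlations as well; the function name is hypothetical.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """Two-tailed z test for the difference between two independent correlations."""
    # Fisher's transformation: atanh(r) is approximately normal with variance 1/(n - 3).
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    # Two-tailed p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Satisfaction vs. number of roles: part-time workers (r = .28) and
# full-time workers (r = -.21); the sample sizes are an assumption (see lead-in).
z, p = fisher_z_difference(0.28, 42, -0.21, 73)
print(f"z = {z:.2f}, two-tailed p = {p:.3f}")
```

Under these assumptions the two-tailed p value falls between .01 and .02, which agrees with the reported p < .02.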

Self-pressures were again important for 
only the housewives, since these pressures were 
negatively related to satisfaction and happi- 
ness. Time pressures were again important 
only to the full-time workers, since these pres- 
sures were negatively correlated with satis- 
faction. The incidence of nonhome pressure 
was not related to these outcomes for any 
group or for the total sample. 


[Table 4 body. Columns: Satisfaction and Happiness for each of the total sample, 
full-time housewives, part-time workers, and full-time workers. Rows: Presence of 
conflict, Time, Home, Nonhome, Self, No. roles, and one further row whose label is 
illegible. The cell values are not legible in the source.] 

Note. n = [illegible], [illegible], [illegible], and 73 for the total sample, full-time 
housewives, part-time workers, and full-time workers, respectively. 
* p < .05, two-tailed. 
** p < .01, two-tailed. 


Discussion 
Work and the Primacy of the Home 


The hypothesized relationship between a 
woman’s satisfaction and the fit between what 
she prefers to do and what she actually does 
was found to exist for nonemployment activ- 
ities (volunteer and housewife work), but not 
for part-time or full-time employment. This 
result is especially noteworthy in the case of 
part-time work, since more women in the 
sample state a preference for this than for 
any other activity. 

These results suggest first that the career 
choices of the work-oriented married woman 
are more difficult to implement successfully 
than are the choices of home-oriented women. 
Home-related tasks and volunteer activities 
are part of the traditionally accepted roles of 
wife and mother. The woman who by her own 
choice prefers to do these activities will find 
external role support, acceptance, admiration, 
and intrinsic satisfaction for doing them. 
Since employment is outside the traditional 
home roles, the woman preferring to work 
may encounter increased role conflicts, time 
pressure, prejudice, and discrimination when 
she seeks employment. These problems may 
offset the satisfaction which a work-oriented 
woman would otherwise receive by doing what 
she prefers to do. 

This reasoning is supported by the data on 
role conflicts. For all groups, employed or not, 
home pressures were the most important con- 
tributors to experienced conflict, low satis- 
faction, and low happiness. The most consis- 
tent combination of pressures was the classic 
home versus nonhome clash. These data sup- 
port the primacy of home-related activities 
for married women, whether they happen to be 
personally oriented toward full-time home 
activities or not. 


Part-Time versus Full-Time Work 


A second conclusion to be drawn from the 
results is that the difference between part-time 
and full-time work is as distinct as that be- 
tween working and not working. 


Full-time workers experienced greater satis- 
faction than women who worked part time. 
Part-time workers had a higher proportion of 
conflicts (particularly of home-related con- 
flicts) and more roles to manage than either 
of the other two groups (significantly more 
than housewives). Indeed, part-time workers 
reported the lowest satisfaction of any women 
doing what they preferred, even though more 
women preferred part-time work than any 
other activity. Examination of the comments 
written in on the questionnaires suggested 
four possible explanations for this. 

First, part-time jobs are often not especially 
challenging and rewarding. A second reason 
why part-time work is not more satisfying is 
that for some women it represents an incom- 
plete resolution of the internal conflict about 
a career, a compromise between working full 
time and not being employed at all. 

A closely related factor is the role overload 
resulting from such a compromise. The ques- 
tionnaires of part-time workers indicated that 
they also performed several other activities 
such as volunteer work and being full-time 
housewives. 

It has also been shown earlier that part- 
time workers have more roles and experience 
more conflicts and home pressures than 
women in the other groups. The part-time 
worker, because of the less demanding nature 
of part-time work, may have made fewer role 
reductions than one might expect and is there- 
fore spread very thin. 

A fourth possible explanation for the low 
satisfaction of part-time work is that in addi- 
tion to the role overload of part-time workers, 
they may have also developed less effective 
resources and strategies for coping with role 
conflicts. This hypothesis was tested by com- 
paring the frequency of different types of 
coping techniques for part-time versus full- 
time workers, using a system developed by 
Hall (1972). No significant differences were 
found. Therefore, insufficient coping does not 
appear to be an important factor in the rela- 
tively low satisfaction experienced by part- 
time workers. 

There is some indication that the part- 
time worker is simply a different type of 
person from the full-time worker and that she 
may not mind role overload. First, in contrast 
to full-time workers and housewives, her 
satisfaction and happiness are not affected 
by pressure or conflict. Even though she is 
less satisfied than women in the other groups, 
she strongly prefers to remain a part-time 
employee. Finally, even though she is already 
suffering from role overload, she reports sig- 
nificantly greater satisfaction from multiple 
roles than does the full-time employee. The 
part-timer’s satisfaction may come from mul- 
tiple involvements—activity qua activity— 
whereas full-time employees and housewives 
seek deeper involvements and achievements 
in a more limited number of activities. 


Unique Features of Each Group 


Each career group showed distinctive pat- 
terns of role pressures and outcomes. The 
unique characteristic of part-time workers was 
role overload and apparent satisfaction re- 
sulting from overload. Full-time employees 
were the most satisfied group, experienced 
the greatest time pressures, and stood mid- 
way between part-time employees and house- 
wives in many respects. Housewives were 


unique in the salience of self-pressures as a 
factor in conflicts and satisfaction. 
INDIVIDUAL DIFFERENCES IN THE DECISION 
PROCESS OF EMPLOYMENT INTERVIEWERS 


ENZO VALENZI 1 


University of Rochester 


Four placement interviewers rated 243 secretarial job applicants on the basis 
of five kinds of information cues. Despite many [...] uniformity, wide 
individual differences in cue utilization [...] agreement about how many [...] 
and his actual cue [...] a small, but potentially [...] 


Recent research on the employment inter- 
view has focused on how selection interviewers 
make decisions, rather than upon the predic- 
tive validity of the decisions made (e.g., Carl- 
son & Mayfield, 1967; Webster, 1964). Since 
most of these studies grouped data across sub- 
jects to investigate the impact of critical 
variables in the interview situation, individual 
differences in the decision processes of em- 
ployment interviewers have largely been ig- 
nored. Therefore, the present study was de- 
signed to provide data on individual differ- 
ences in decision processes. 

In a study similar to the present one, Dob- 
meyer (1970) found striking individual dif- 
ferences in the way information cues were 
utilized. An important kind of information 
for one campus recruiter could be completely 
ignored by another recruiter. Similar indi- 
vidual differences in cue utilization have been 
reported for radiologists (Hoffman, Slovic, & 
Rorer, 1968), clinical psychologists (Gold- 
berg, 1968), and stockbrokers (Slovic, 1969). 
Dobmeyer's result was also consistent with 
earlier studies by Wentworth (1953) and 
Webster (1964). 

Dobmeyer (1970) also reported data on 
configural cue use by his 35 campus recruit- 
ers. Consistent with analysis of variance 
(ANOVA) studies on other kinds of decision 
makers, configural cue use accounted for only 
a small fraction of the total decision variance 
for the majority of the recruiters. Once again 
there were wide individual differences in cue 
utilization practices. 


1 Requests for reprints should be sent to Enzo Va- 
lenzi, Management Research Center, Graduate School 
of Management, University of Rochester, Rochester, 
New York 14627. 

Not entirely clear in any of the above stud- 
ies was the degree to which judges were aware 
of their own cue utilization practices. Slovic 
(1969) presented some incidental data which 
suggested that his stockbrokers did have such 
insights, but the procedures for collecting 
these data were questionable. Lastly, where 
reliability figures have been reported, it seems 
that judges were fairly consistent in applying 
their personal decision models. 

Previous studies have been quite consistent 
in reporting wide individual differences in 
cue utilization practices. However, possible 
reasons for these differences have not been 
explored. The present study attempts to pro- 
vide a partial step in that direction. 

By working with four [...] drawn from the [...] 
by providing [...] cues, several sources of [...] 
individual differences in [...] information cues were [...] 
favor the detection [...] on the basis of five [...]. 
On the basis of [...] 
TABLE 1 

CUES AND VALUES OF CUE LEVELS 

Cue level | Typing^a | Shorthand^a | Experience^b | Education            | Social skills
1         | 40-45    | 60          | 6            | 3 yrs. high school   | Below average
2         | 60-65    | 80          | 18           | High school graduate | Average
3         | 80-85    | 100         | 36           | 1 or 2 yrs. college  | Above average

a Values in this column are words per minute. 
b Values in this column are months. 


[...] experienced employment interviewers would vary 
substantially, as evidenced by the actual deci- 
sions made, in the importance attached to dif- 
ferent kinds of information about job appli- 
cants. Second, it was anticipated that the four 
interviewers would be quite different in the 
amount and location of configural cue use. 
Third, it was anticipated that each inter- 
viewer would be reliable in applying his own 
decision model, that is, the model derived 
from an analysis of his decision patterns. 

Also investigated was the degree to which 
interviewers agreed with one another (when 
asked in an interview) about the presumed 
importance of various kinds of information 
about job applicants. Lastly, the degree to 
which interviewers had insights into their own 
cue utilization practices was examined. On 
these last two points, no predictions were 
made. 


METHOD 
Subjects 


The subjects (judges) were [...] 
interviewers who [...] 
state employment office 6 months or [...]. 
[...] primary task on their regular jobs is to [...] indi- 
viduals seeking employment to a job for which they 
qualify and are willing to accept if it is offered to 
them. The interviewer forms an evaluation of the 
applicant's qualifications on the basis of a standard 
application blank (recorded by trained intake inter- 
viewers) and an interview with the applicant. The 
available jobs are described in job orders containing 
a Dictionary of Occupational Titles classification 
code number, a brief description of the duties and 
responsibilities of the job obtained from the em- 
ployer, and a statement of the skills, experience, 
education, etc., required. 


Procedure 


The job of secretary was chosen because all of the 
interviewers were experienced with that job category 
and because one of our advisors, a placement super- 
visor in a nearby employment office, was especially 
knowledgeable about secretarial positions. With the 
help of other expert judges the placement supervisor 
selected five kinds of information cues that were 
considered essential for the evaluation of persons 
applying for secretarial positions. The advisors also 
helped us select three realistic and appropriate levels 
for each of the five kinds of information cues (see 
Table 1). The end result—each of five cues varied at 
three levels—enabled us to form 243 hypothetical job 
applicants, one for each possible cue combination. 
To add realism, the cue values on the 8½ X 5½ cards 
that represented job applicants were varied some- 
what around the values shown in Table 1. The 243 
cards created through the above process were repli- 
cated to produce a total of 486 job applicants. The 
four placement interviewers did not know about 
the replication and assumed that they were judging 
486 different job applicants. Using a table of ran- 
dom numbers, the 486 applicants were ordered inde- 
pendently for each judge. This random order was 
modified slightly to eliminate cases where replications 
were close to one another. 
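
The design described above (five cues, each at three levels, fully crossed and then replicated) can be sketched as follows. This is only an illustration: the cue identifiers and the use of a seeded pseudorandom shuffle are assumptions, whereas the study used a table of random numbers and then adjusted the order by hand so that replicates did not fall close together, a step this sketch omits.

```python
import itertools
import random

# Hypothetical identifiers for the five cues of Table 1.
CUES = ["typing", "shorthand", "experience", "education", "social_skills"]
LEVELS = (1, 2, 3)

# One hypothetical applicant per cue combination: 3 ** 5 = 243 profiles.
profiles = [dict(zip(CUES, combo))
            for combo in itertools.product(LEVELS, repeat=len(CUES))]

# Each profile is presented twice, giving 486 cards per judge.
deck = profiles * 2


def order_for_judge(n_cards, seed):
    """Return an independent random presentation order for one judge."""
    rng = random.Random(seed)
    order = list(range(n_cards))
    rng.shuffle(order)
    return order


orders = {judge: order_for_judge(len(deck), seed=judge) for judge in (1, 2, 3, 4)}
print(len(profiles), len(deck))  # prints: 243 486
```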

In addition to the job applicant cards, each judge 
received a job order form for a secretarial position 
and a page of instructions explaining the rating 
procedure. The judges were instructed to rate each 
applicant on the basis of the information cues pro- 
vided and to indicate their evaluation on a hori- 
zontal scale at the bottom of each applicant card. 
The zero point (0) on the rating scale was labeled 
no chance of being hired, the middle point (5) was 
labeled 50-50 chance of being hired, and the high 
end of the scale (10) was labeled certain to be hired. 

The placement interviewers completed the ratings 
independently in their leisure time. When all the 
judges had completed their ratings, but prior to any 
analysis of the data, each judge was interviewed to 
obtain information on the procedures she used to 
rate applicants. Judges were also asked to indicate 
their perceptions of the relative importance of the 
five kinds of information cues. Each judge received 
an honorarium of $50. 

It was discovered in the postexperimental inter- 
view that one of the interviewers (Judge 1) did not 
process cues clinically. Instead she used a hand 
calculator to sum the cross products of coded values 
[...] cues provided and assigned weights for her 
[...] relative importance of the five 
kinds of information. Though not strictly relevant 
to the purposes of this study, the data from Judge 
1 were included in several of the results tables to 
provide a comparison against the data derived from 
judges who processed cues clinically. 
RESULTS 


The decision pattern of each interviewer 
was analyzed with a separate ANOVA. The 
resultant eta-squared values 3 shown in Table 
2 revealed the judge's effective cue weight for 
each kind of information cue (main effects) 
and further showed which cues were used con- 
figurally (interaction terms). When the eta- 
squared values in Table 2 are converted to 
percentages, they provide an estimate of the 
amount of decision variance accounted for by 
each main effect and by each two-way inter- 
action. Because they accounted for essentially 
zero decision variance, three-way interactions 
were omitted. 
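
As an illustration of how an eta-squared value acts as an effective cue weight, the sketch below computes main-effect eta-squared values (the sum of squares for a cue divided by the total sum of squares) for one hypothetical judge. The judge's ratings are generated from a purely additive rule with made-up weights, which resembles the mechanical procedure reported for Judge 1; none of the numbers are the study's data.

```python
import itertools
from statistics import mean

CUES = ["typing", "shorthand", "experience", "education", "social_skills"]
LEVELS = (1, 2, 3)

# Hypothetical weights for an additive (non-configural) rating rule.
WEIGHTS = {"typing": 2.0, "shorthand": 1.0, "experience": 0.5,
           "education": 1.5, "social_skills": 1.0}

profiles = [dict(zip(CUES, combo))
            for combo in itertools.product(LEVELS, repeat=len(CUES))]
ratings = [sum(WEIGHTS[c] * p[c] for c in CUES) for p in profiles]


def eta_squared_main(profiles, ratings, cue):
    """Main-effect eta squared: SS for the cue divided by total SS."""
    grand = mean(ratings)
    ss_total = sum((y - grand) ** 2 for y in ratings)
    ss_cue = 0.0
    for level in LEVELS:
        ys = [y for p, y in zip(profiles, ratings) if p[cue] == level]
        ss_cue += len(ys) * (mean(ys) - grand) ** 2
    return ss_cue / ss_total


eta2 = {cue: eta_squared_main(profiles, ratings, cue) for cue in CUES}
for cue, value in eta2.items():
    print(f"{cue:14s} {value:.2f}")
```

Because this rule is additive and the design is balanced, the five main effects account for all of the variance and the interaction terms for none, which is the pattern Table 2 shows for Judge 1.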

By looking just at the main effects data in 
Table 2, it is clear that the utilization of the 
five information cues varied considerably 
across judges. For example, social skills was 
the most important cue for Judge 3 and the 
least important cue for Judge 1, while educa- 
tion was most important for Judge 4 and 
much less important for Judge 1. 

As evidenced by the variation in eta-squared 
values for the interaction terms in Table 2, 
there were also substantial individual differ- 
ences in configural cue use. The most config- 
ural interviewer was Judge 4. For this inter- 
viewer, the Education X Social Skills and the 
Shorthand X Education interaction terms pro- 
vide a worthwhile supplement to a strictly 
linear representation of her decision strategy. 
In the case of Judge 2, it was the Typing X 
Social Skills interaction term which might be 
of value in representing her decision processes. 
For Judges 1 and 3, none of the interaction 
terms accounted for much of the decision vari- 
ance. 

While it is clear from the foregoing that 
there were substantial differences in cue use 
among the interviewers, these are of particu- 
lar interest if associated with appreciable dif- 
ferences in judgments. To shed light on this 
question, the job applicant ratings by Judges 
2, 3, and 4 were intercorrelated and corrected 
for attenuation. When these corrected inter- 
correlations (shown in Table 3) are squared 
and converted to percentages, they indicate 


3 While they are not reported here, the associated 
F values were all significant at the p < .05 level or 
less. 


TABLE 2 

ETA-SQUARED VALUES FOR EACH CUE AND 
EACH TWO-WAY INTERACTION 

Cue                  | Judge 1     | Judge 2     | Judge 3     | Judge 4
Typing (A)           | .23         | .29         | [illegible] | .08
Shorthand (B)        | .24         | [illegible] | [illegible] | .05
Experience (C)       | [illegible] | .04         | .03         | .00
Education (D)        | [illegible] | .10         | .08         | .42
Social skills (E)    | .12         | .23         | [illegible] | .17
A X B                | .00         | .02         | .01         | .01
A X C                | .00         | .01         | .00         | .00
A X D                | .00         | .01         | .00         | .01
A X E                | .00         | .03         | .01         | .01
B X C                | .00         | .00         | .00         | .00
B X D                | .00         | .00         | .00         | .03
B X E                | .00         | .01         | .01         | .01
C X D                | .00         | .00         | .00         | .00
C X E                | .00         | .00         | .00         | .00
D X E                | .00         | .01         | .01         | .09
Total
  Main effects       | .92         | .70         | .84         | .73
  Interaction effects| .00         | .10         | .04         | .16


[...] between pairs of judges. Subtracting the com[...] 
yields the percentage [...]ic variance between [...] 
Judges 2 and 3, but was 30% for Judges 2 
and 4, and 34% for Judges 3 and 4. Thus it [...] 
[...]wers in this [...] 

[...] interest is the degree to 
which the interviewers had accurate insights 
into their own decision strategies. To obtain 


TABLE 3 

INTERJUDGE CORRELATION MATRIX 

Judge | 2 | 3 | 4 
[Cell values are not legible in the source; one legible entry reads .81.] 

Note. N = 243; bottom value in each cell is [...] corrected for attenuation. 
TABLE 4 

PERCEIVED CUE IMPORTANCE COMPARED TO 
EFFECTIVE CUE WEIGHTS 

              | Judge 2               | Judge 3               | Judge 4
Cue           | Perceived rank | η²   | Perceived rank | η²   | Perceived rank | η²
Typing        | 2 | .29         | 1 | [illegible] | 2 | .08
Shorthand     | 3 | [illegible] | 2 | [illegible] | 1 | .05
Experience    | 5 | .04         | 3 | .03         | 5 | .00
Education     | 1 | .10         | 5 | .08         | 4 | .42
Social skills | 4 | .23         | 4 | [illegible] | 3 | .17


data on this question each interviewer was 
asked to rank the five information cues in 
order of their importance for the decisions 
being made. These rankings were obtained in 
an interview held after the interviewer had 
made all of her decisions, but before any of 
the decision patterns had been analyzed. Large 
discrepancies between a judge's perceived 
rankings and the relative size of her eta- 
squared values for the same information cues 
indicate a lack of insight into her own cue 
utilization. As shown in Table 4, all three 
judges (Judge 1 was omitted for reasons 
stated earlier) had one or more serious dis- 
crepancies between perceived and actual im- 
portance [...] her actual [...]. 
With the exception [...] was 
a fair amount of agreement [...] 
the perceived importance of the [...]. 
Therefore, it is possible that the interjudge 
differences in actual cue weights (Table 2) 
were in large measure due to an inability to 
process cues according to the judges’ own 
perception of the relative importance of the 
five kinds of information. This possibility is 
examined more closely in the discussion sec- 
tion. 
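
The comparison of perceived ranks with effective cue weights can be made concrete by ranking a judge's eta-squared values and setting those ranks beside her stated ranks. The sketch below does this for Judge 4, using the values as best they can be read from the scanned Table 4. The Spearman rank correlation is added here as a convenient summary and is an assumption of this sketch; the article compares the ranks by inspection only.

```python
# Judge 4, as read from Table 4 (perceived rank: 1 = most important).
perceived_rank = {"typing": 2, "shorthand": 1, "experience": 5,
                  "education": 4, "social_skills": 3}
eta_squared = {"typing": 0.08, "shorthand": 0.05, "experience": 0.00,
               "education": 0.42, "social_skills": 0.17}


def ranks_from_weights(weights):
    """Rank cues from largest (1) to smallest weight; assumes no ties."""
    ordered = sorted(weights, key=weights.get, reverse=True)
    return {cue: position for position, cue in enumerate(ordered, start=1)}


def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two untied rankings of the same items."""
    n = len(rank_a)
    d_squared = sum((rank_a[k] - rank_b[k]) ** 2 for k in rank_a)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


effective_rank = ranks_from_weights(eta_squared)
rho = spearman_rho(perceived_rank, effective_rank)

for cue in perceived_rank:
    print(f"{cue:14s} perceived {perceived_rank[cue]}  effective {effective_rank[cue]}")
print(f"Spearman rho = {rho:.2f}")
```

For these values education ranks first in effective weight but fourth in perceived importance, and shorthand ranks first in perceived importance but fourth in effective weight, which is the kind of discrepancy the text describes.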

Though the judges lacked insight into their 
own decision practices, they were nonetheless 
quite consistent in the way cues were han- 
dled. As stated earlier, each interviewer rated 
each job applicant twice, thus making it possi- 
ble to calculate a test-retest reliability co- 
efficient. As was true in Hoffman et al.'s 
(1968) study, these coefficients were quite 
high, .82 for Judge 2 and .90 or better for 
the other judges. 
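
The reliability coefficients above are also the quantities needed for the correction for attenuation applied to the interjudge correlations. The sketch below shows the usual formulas: a Pearson correlation between first and second ratings as the test-retest reliability, the observed interjudge correlation divided by the square root of the product of the two reliabilities, and unique variance as 100% minus the squared corrected correlation. The function names and the observed correlation of .73 are hypothetical; only .81, .82, .90, and 34% come from the article.

```python
import math


def pearson_r(x, y):
    """Pearson correlation, usable as a test-retest reliability coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Interjudge correlation corrected for the unreliability of both judges."""
    return r_xy / math.sqrt(r_xx * r_yy)


def unique_variance_pct(r_corrected):
    """Percentage of variance NOT shared by two judges."""
    return 100.0 * (1.0 - r_corrected ** 2)


# Hypothetical observed correlation between two judges whose reliabilities are .90.
corrected = correct_for_attenuation(0.73, 0.90, 0.90)
print(f"corrected r = {corrected:.2f}")

# The one legible Table 3 entry, .81, implies about 34% unique variance.
print(f"unique variance = {unique_variance_pct(0.81):.0f}%")
```

The second figure matches the 34% that the text reports for Judges 3 and 4.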


Discussion 


Despite the uniformity of the judgment 
task, the experience and reliability of the 
judges, and the high degree of interjudge 
agreement on the perceived relative impor- 
tance of the five cues, substantially different 
decisions were made about the relative worth 
of many of the job applicants. 

A key source of interjudge disagreement 
was the failure of judges to process cues con- 
sistent with their estimate of perceived rela- 
tive importance. Examples of how erroneous 
practices might have occurred were revealed 
in the postexperimental interview. For Judges 
3 and 4, the cue that controlled the most deci- 
sion variance was used in a multiple-cutoff 
fashion by these judges; for example, Judge 4 
reported that all applicants who were not high 
school graduates received a rating of five or 
less. A similar strategy was reported by Judge 
3 for the social skills cue. 

In summary to this point, the results of 
the present study corroborated other studies 
(e.g., Dobmeyer, 1970; Slovic, 1969) that 
reported substantial individual differences in 
effective cue weights. It appears that, in part, 
these differences are traceable to errors in cue 
processing which produce a serious discrep- 
ancy between perceived and actual cue weights. 
[...] of error could be greatly 
[...] appropriate kinds of train- 
ing. 

In the present study, configural cue use 
was larger than that reported for other kinds 
of decision makers (e.g., Slovic, 1969) and 
possibly somewhat greater than Dobmeyer 
(1970) found. As was true in all of the earlier 
studies, the amount of decision variance ac- 
counted for by the interaction terms was 
quite small relative to the main effects. None- 
theless, some of the individual interaction 
terms were large enough to be of potential 
practical importance. For example, the Edu- 
cation X Social Skills interaction term ac- 
counted for 8.5% of the decision variance of 
Judge 4. When the two sets of replicates were 
analyzed separately for Judge 4, the Educa- 
tion X Social Skills interaction accounted for 
8% of the total decision variance in one sam- 
ple and 8.9% in the other sample. Moreover, 
this particular interaction term was mentioned 
by Judge 4 in a postexperimental interview 
conducted before her decision data had been 
analyzed. In brief, the configural use of edu- 
cation and social skills cues was intended and 
it was replicated. It should be noted, however, 
that the size of the interaction term might 
have been exaggerated by Judge 4's tendency 
to use education cues in a multiple-cutoff 
fashion. 

Any attempt to detect and verify configural 
cue use is complicated by the fact that 
ANOVA and regression analysis both remove 
linear variance first and then search for con- 
figurality among the variance which remains. 
In both kinds of analysis, this sequence can 
lead to a linear representation of some of the 
configural variance. To avoid this [...] 
expropriation, Cohen (1968) has [...] a 
regression analysis technique which [...] 
the configural variance first. When applied to 
the data in the present study, Cohen’s pro- 
cedures revealed that the ANOVA had under- 
stated the amount of configural cue use by 
Judge 3. Instead of 4%, her configural [...] 
was 11% of her total decision variance. For 
Judges 2 and 4, on the other hand, there was 
no appreciable change in results; the ANOVA 
had done an adequate job of detecting their 
[...]. 

Having [...] with both kinds of analysis 
[...], the authors are in- 
clined to [...] ANOVA for the typical 
applied study. It is somewhat easier to uti- 
lize and seems to detect most of the configural 
variance which is large enough to be of poten- 
tial practical significance. Perhaps Cohen’s 
technique could be applied as a supplementary 
analysis in situations where the decision maker 
claims to use certain cues configurally, but 
his claim is not confirmed by ANOVA. 
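
The order dependence discussed above can be seen in a toy case. The sketch below is not Cohen's (1968) procedure; it only illustrates why the sequence matters. A hypothetical judge combines two three-level cues by a purely multiplicative (configural) rule. When main effects are removed first, as in ANOVA, most of the variance is credited to the linear terms; when the product term is entered first, it accounts for all of it.

```python
from statistics import mean

LEVELS = (1, 2, 3)
cells = [(a, b) for a in LEVELS for b in LEVELS]
# Hypothetical purely configural rule: the rating is the product of the two cue levels.
ratings = [a * b for a, b in cells]

grand = mean(ratings)
ss_total = sum((y - grand) ** 2 for y in ratings)


def ss_main(index):
    """Sum of squares for the main effect of cue 0 or cue 1 (balanced design)."""
    total = 0.0
    for level in LEVELS:
        ys = [y for cell, y in zip(cells, ratings) if cell[index] == level]
        total += len(ys) * (mean(ys) - grand) ** 2
    return total


def r_squared(x, y):
    """Squared Pearson correlation between two equally long sequences."""
    mean_x, mean_y = mean(x), mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov * cov / (var_x * var_y)


# Main effects first: the interaction only receives what is left over.
main_first_share = (ss_main(0) + ss_main(1)) / ss_total
interaction_residual_share = 1.0 - main_first_share

# Product term first: it alone reproduces the ratings.
product_first_share = r_squared([a * b for a, b in cells], ratings)

print(f"main effects entered first: {main_first_share:.1%}")
print(f"interaction left over:      {interaction_residual_share:.1%}")
print(f"product term entered first: {product_first_share:.1%}")
```

In this toy case roughly 92% of the variance is attributed to main effects even though the generating rule contains no additive component.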

In future research along this line, the au- 
thors strongly recommend that phenomeno- 
logical data be gathered in a postexperimental 
interview conducted before the decision pat- 
terns are analyzed. This kind of data was very 
helpful in the present study. Also, additional 
insights into the decision-making process 
might be obtained through interviews con- 
ducted after the decision patterns have been 
analyzed. Perhaps group sessions in which 
the interviewers discuss their individual dif- 
ferences in cue utilization practices would be 
most helpful. In addition to its research value, 
this kind of discussion could be of value as an 
interviewer training procedure. 
SUBORDINATE PERSONALITY AS A MODERATOR OF 
THE EFFECTS OF PARTICIPATION IN THREE 
TYPES OF APPRAISAL INTERVIEWS 


KENNETH N. WEXLEY, J. P. SINGH, AND GARY A. YUKL 


University of Akron 


The current controversy about the importance of a subordinate's personality 
as a moderator of his reaction to participative leadership is especially relevant 
to the appraisal interview situation. This laboratory study investigated whether 
two personality variables (authoritarianism and need for independence) affect 
the relationship of the amount of participation a subordinate is permitted 
during the appraisal interview with his satisfaction with the interview and 
motivation to improve subsequent job performance. The results indicated that 
it is desirable to allow a subordinate to have substantial participation in 
appraisal interview decisions, regardless of his personality structure. When both 
dependent variables are taken into account, the problem-solving style of 
appraisal interview appears to be superior to the tell and sell and the tell and 
listen styles. 


There is a considerable body of research 
which indicates that subordinate participation 
in decision making usually results in greater 
satisfaction and productivity (see reviews by 
Vroom, 1964; Yukl, 1971). However, some 
studies indicate that the beneficial effects of 
a participatory pattern of supervision may 
depend on certain personality characteristics 
of the subordinates. Vroom (1959), for ex- 
ample, investigated two personality variables 
thought to be important in determining a 
subordinate’s reaction to participation, 
namely, authoritarianism and need for inde- 
pendence. He found that participation was 
positively correlated with both job perfor- 
mance and satisfaction for subordinates who 
were low in authoritarianism and high in need 
for independence, whereas for subordinates 
[...] personality scores, there was 
no significant correlation. More recently, 
[...] a laboratory ex[...] 
[...] business organiza[...] 
[...] similar to those of [...] 


[...] help in conducting the appraisal interviews. 


authoritarianism. However, a subsequent 
study by Tosi (1970) failed to confirm the 
results of Vroom and Campion. In Tosi’s 
study, a significant positive correlation was 
found between participation and subordinate 
job satisfaction, but this relationship was not 
moderated by subordinate authoritarianism or 
need for independence. Furthermore, partici- 
pation was not significantly correlated with 
performance, regardless of the subordinate’s 
personality scores. The reason for the differ- 
ence between Tosi’s results and those of 
Vroom and Campion is not evident, although 
the difference may have been due to the lack 
of similarity between the subjects, the organi- 
zations, the nature of the work, the measure 
of effectiveness, and the geographical areas 
used. 

The controversy about the importance of 
personality as a moderator of subordinate 
reactions to participative leadership is espe- 
cially relevant to the appraisal interview situ- 
ation. A number of writers have advocated 
substituting mutual goal setting or manage- 
ment by objectives in place of the tradi- 
tional appraisal interview in which interview- 
ers praise, criticize, and perhaps set goals for 
the interviewee (Kelly, 1958; Kindall & 
Gatza, 1963; Likert, 1959; Maier, 1958; Mc- 
Gregor, 1957; Meyer, Kay, & French, 1965; 
Odiorne, 1965; Patton, 1960). However, other 
writers question the superiority of partici- 
pative leadership within the appraisal inter- 
view (Huttner & O'Malley, 1962; Mayfield, 
1960; Miller, 1965). Only one study has in- 
vestigated the interaction of subordinate per- 
sonality and leadership behavior during ap- 
praisal interviews. French, Kay, and Meyer 
(1966) concluded that those who are high in 
independence needs, as compared with those 
who are low in independence needs, tend to 
respond more favorably to increases and less 
favorably to decreases in participation during 
appraisal interview discussions with their 
supervisors. Several limitations in the study 
raise doubts about its conclusion regarding the 
moderating effect of subordinate personality. 
First, this conclusion was based on data that 
were not tested for significance because of the 
small and unequal number of cases in some of 
the cells. Second, as the authors themselves 
pointed out, an interviewee's usual participa- 
tion in his daily relationship with his super- 
visor confounded the experimenters’ compari- 
sons between the high- and low-participation 
appraisal interviews. Finally, only one person- 
ality variable—need for independence—was 
used in the study, and the short questionnaire 
used to measure this personality variable 
lacked homogeneity. 
The absence of consistent results in the 
leadership literature and the lack of definitive 
studies on appraisal interviewing per se point 
out the fact that additional research is needed 
in order to understand the role of subordinate 
personality as a moderator of the effects of 
participation in appraisal interviews. The pur- 
pose of the present study was to determine 
whether need for independence and authori- 
tarianism affect the relationship between the 
amount of participation the interviewee is per- 
mitted during the appraisal interview and his 
(a) satisfaction with the interview and (b) 
motivation to improve subsequent job perfor- 
mance. The study was conducted in a labora- 
tory setting in order to control for the poten- 
tial source of confounding due to the usual 
level of participation a subordinate is allowed 
by his supervisor outside of the appraisal 
interview. 

METHOD 

Subjects 

Four hundred and thirty-eight undergraduate psy- 
chology students volunteered to fill out question- 
naires measuring need for independence and authori- 
tarianism. The measure of need for independence 
consisted of a 16-item questionnaire used by Vroom 
(1959). Authoritarianism was measured by the 25- 
item questionnaire used by Vroom (1959). It con- 
sisted of items from Forms 40 and 45 of the F scale 
(Adorno, Frenkel-Brunswick, Levinson, & Sanford, 
1950).4 Based on these two personality scores, 108 
subjects were chosen to participate in the study. 
Twenty-seven subjects were selected for each of the 
following four personality groups: (a) high need for 
independence, (b) low need for independence, (c) 
high authoritarianism, and (d) low authoritarianism.5 
The subjects for the low and high groups were chosen 
from the first and fourth quartiles, respectively, of 
the score distribution. The subjects were paid $2 for 
their participation in the study. 
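
The quartile-based selection described above can be sketched in a few lines of Python. This is only an illustration: the scores are simulated, the function name `select_extreme_groups` is hypothetical, and drawing subjects at random within each quartile is an assumption, since the text does not say how the 27 subjects were picked within a quartile.

```python
import random
import statistics


def select_extreme_groups(scores, n_per_group, seed=0):
    """Pick n_per_group subject ids from the bottom and top quartiles.

    scores maps subject id -> questionnaire score (hypothetical data).
    Random sampling within each quartile is an assumption, not a
    procedure reported in the study.
    """
    q1, _median, q3 = statistics.quantiles(list(scores.values()), n=4)
    low_pool = sorted(sid for sid, s in scores.items() if s <= q1)
    high_pool = sorted(sid for sid, s in scores.items() if s >= q3)
    rng = random.Random(seed)
    return rng.sample(low_pool, n_per_group), rng.sample(high_pool, n_per_group)


# Simulated integer scores for 438 respondents (range chosen arbitrarily).
_rng = random.Random(1)
scores = {sid: _rng.randint(16, 80) for sid in range(438)}

low, high = select_extreme_groups(scores, 27)
print(len(low), len(high))
```

Running the same function once per questionnaire would yield the four groups of 27 named in the text.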


Design 


The subjects in each of the four personality groups 
were randomly assigned to three participation levels. 
Level of participation was varied by using the three 
styles of appraisal interview suggested by Maier 
(1958). The tell and sell (TS) method allows the 
subordinate a minimum amount of participation in 
the interview. The interviewer tells the subordinate 
his strengths and weaknesses and then attempts to 
persuade him to follow the suggestions given for his 
improvement. In the tell and listen (TL) method, the 
interviewer tells the subordinate his strengths and 
weaknesses and then allows the subordinate an op- 
portunity to express his feelings about the evalua- 
tion. This method permits some subordinate partici- 
pation in the interview. The problem-solving (PS) 
approach gives the subordinate maximum participa- 
tion in the appraisal interview. The supervisor uses 
nondirective, open-ended questions to encourage the 
subordinate to express his ideas and feelings about 
solutions to problems. Goals and plans for improving 
subordinate performance are agreed upon by the 
subordinate and the supervisor. 

Three male graduate students conducted the ap- 
praisal interviews. Each student was given 6 hours of 
training in conducting the three styles of appraisal 
interview. During the study, each interviewer per- 
formed each type of interview with the same 
number of subjects from each of the four personality groups. 
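
A minimal sketch of the random assignment step follows. It assumes equal cell sizes (9 subjects per interview style within each personality group of 27), which is not stated outright in the text; the function name `assign_to_levels` and the subject ids are hypothetical.

```python
import random


def assign_to_levels(subject_ids, levels=("TS", "TL", "PS"), seed=0):
    """Shuffle one personality group and deal it evenly across interview styles.

    Equal cell sizes are an assumption made for this sketch.
    """
    shuffled = list(subject_ids)
    if len(shuffled) % len(levels) != 0:
        raise ValueError("group size must be divisible by the number of levels")
    random.Random(seed).shuffle(shuffled)
    per_level = len(shuffled) // len(levels)
    return {
        level: shuffled[i * per_level:(i + 1) * per_level]
        for i, level in enumerate(levels)
    }


# One hypothetical personality group of 27 subjects.
cells = assign_to_levels(range(27))
for level, members in cells.items():
    print(level, len(members))
```

Fixing the seed makes the assignment reproducible, which is convenient when the allocation has to be documented afterwards.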


Procedure 


3 Vroom (1959) reported a test-retest reliability of 
.61 for a shortened form, consisting of eight items, 
of this need for independence scale. 

4 Adorno et al. (1950) reported average reliability 
coefficients of .90 with this 25-item authoritarianism 
scale. 

5 The correlation between authoritarianism and 
need for independence was -.41 for the 438 stu- 
dents, which made it appropriate to use separate 
samples for each personality variable. 

The subjects were asked to play the role of a 
subordinate who had his appraisal interview that 
day. The role-playing exercise was taken from Maier, 
Solem, and Maier (1957). The subjects were given 
the following instructions by the experimenter: 


This is an experiment in interviewing. We want 
you to play the role of a supervisor in industry. 
Your name is Tom Burke and today is your inter- 
view with your boss, Mr. Stanley. He wants to talk 
to you about your job and your department. This 
is a regular end of the year interview where the 
boss evaluates his subordinates by letting them 
know their strengths and weaknesses. Here is your 
role. Please read it and familiarize yourself with 
it. If you have any questions, please let me know 
and I will answer them. After you have finished 
studying your role, I will take you to your inter- 
view with Mr. Stanley. Remember, you are Tom 
Burke and this is your interview with your boss, 
Mr. Stanley. 


After the subjects had studied their role and the 
experimenter was sure that they understood it, each 
subject was directed to a room where an interviewer 
was waiting. The interviewer was unaware of the 
subject's personality structure. Each interview lasted 
about 15-20 minutes. After the interview, the sub- 
jects were asked to complete a questionnaire which 
yielded scores on each of the following variables: 
(a) psychological participation—the amount of par- 
ticipation or influence a subject perceived he had 
during his interview, (b) satisfaction—a subject's 
overall satisfaction with his interview, and (c) moti- 
vation to improve performance—a subject’s reported 
intention to strive toward the goals set during the 
interview. The following five-choice, Likert-type 
items were used: 

A. Psychological participation 

1. I was given an opportunity to state “my side” 
of the issues. 

2. Mr. Stanley asked for my opinion about the 
problems concerning the job. 

3. Who talked the most during the interview? 

4. Who set the goals during the interview? 

B. Satisfaction 

1. I was very satisfied with the interview. 
2. I was always at ease during the interview. 
3. Mr. Stanley was very friendly. 
4. Mr. Stanley was very unfair. 

C. Motivation to improve performance 

1. [...] are you to [...] the goals set 
during the interview? 

2. How motivated are you to improve your job 
performance in general? 
RESULTS 


The success of the manipulation of psycho- 
logical participation was confirmed by com- 
paring the mean psychological participation 
scores for the three styles of appraisal inter- 
view (PS = 15.30, TL = 12.97, and TS = 
7.30). A Newman-Keuls test revealed that 
the three means were significantly different 
from each other. Thus, the PS, TL, and TS 
interviews were clearly perceived by the sub- 
jects as being a high, a moderate, and a low 
psychological participation treatment, respec- 
tively. 

The correlation between satisfaction with 
the interview and motivation to improve per- 
formance was .54, which was low enough to 
justify treating these as separate dependent 
variables. The means for the two dependent 
variables are shown in Table 1. The effects of 
interview type and subordinate personality 
are analyzed in Table 2. A significant main 
effect for interview type was found in each 
analysis, indicating that participation affected 
subordinate satisfaction and motivation to 
improve performance. However, it is evident 
from the absence of significant interactions 
that a subordinate’s personality did not in- 
fluence his reaction to the amount of partici- 
pation he was allowed in the appraisal inter- 
view. Since the results were consistent for 
both personality samples, the effect of inter- 
view type was analyzed for all subjects com- 
bined (see Table 3). 

The F tests for the one-way analysis of 
variance (ANOVA) were highly significant, so 
it was appropriate to make individual com- 
parisons between each pair of interview types. 
A Newman-Keuls test revealed that motiva- 
tion to improve performance was significantly 
greater for the PS interview than for the TL 
interview and was significantly lower for the 
TS interview than for the TL interview; that 
is, the greater the participation, the greater 
the subordinates’ expressed motivation to im- 
prove performance. These results are consis- 
tent with the significant positive correlation 
found between the subject’s psychological 
participation and his motivation to improve 
job performance (r = .54, p < .01). 
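
As a rough arithmetic check on the one-way ANOVA summarized in Table 3, the F ratio is the between-groups mean square divided by the within-groups mean square. The sketch below uses the figures as printed in Table 3; because the within-groups mean square for satisfaction is hard to read in the printed table, the code derives the value implied by the between-groups mean square and F rather than assuming it. The function names are illustrative only.

```python
def f_ratio(ms_between: float, ms_within: float) -> float:
    """F statistic for a one-way ANOVA: between-groups MS over within-groups MS."""
    return ms_between / ms_within


def implied_ms_within(ms_between: float, f_value: float) -> float:
    """Within-groups MS implied by a reported between-groups MS and F."""
    return ms_between / f_value


# Mean squares and F values as printed in Table 3.
motivation_f = f_ratio(50.45, 3.13)
satisfaction_ms_within = implied_ms_within(98.62, 12.44)

print(f"motivation F = {motivation_f:.2f}")
print(f"implied satisfaction MS within = {satisfaction_ms_within:.2f}")
```

The motivation ratio comes out within rounding error of the printed F of 16.11, and the implied within-groups mean square for satisfaction is about 7.93.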

Subordinate satisfaction with the interview 
was not significantly different for the PS and 
TL interviews, but satisfaction was signifi- 
cantly greater for each of these interview 
types than for the TS interview. These results 
are generally consistent with the significant 
positive correlation between psychological par- 
ticipation and satisfaction (r = .48, p < .01). 

TABLE 1 

                               Satisfaction               Motivation
Sample                       TS      TL      PS        TS      TL      PS

Low authoritarianism         ...     ...     ...       ...     7.00    8.78
                             ...     ...     ...       ...     ...     ...
High authoritarianism        ...     15.44   16.33     ...     8.56    9.67
                             ...     (2.20)  (2.08)    (1.91)  (1.23)  (0.40)
Low need for independence    ...     14.22   16.89     6.89    8.00    8.89
                             ...     ...     ...       ...     ...     ...
High need for independence   ...     15.44   15.67     6.11    7.89    8.22
                             ...     ...     (2.61)    (2.33)  (2.32)  (1.56)
[...]                        ...     14.78   16.28     6.53    7.86    8.89
                             ...     (2.68)  (2.35)    (1.98)  (1.94)  (1.30)

Note. Standard deviations are in parentheses. TS = tell and sell, TL = tell and listen, and PS = 
problem solving. 


DISCUSSION 

The results indicated that it is desirable to 
allow subordinates to have substantial partici- 
pation in appraisal interview decisions, regard- 
less of the level of subordinate authoritarian- 
ism or need for independence. When both 
interviewee satisfaction and motivation to im- 
prove performance are taken into account, it 
is clear that the PS style of appraisal inter- 
view is superior to the TS and the TL styles. 
These results provide empirical support for 
the position of those writers cited earlier who 
have promulgated the advantages of subordi- 
nate participation in appraisal interviews. 

The absence of an interaction between par- 
ticipation and subordinate personality may 
have been due to the particular nature of the 
supervisor-subordinate relationship during an 
appraisal interview. Decisions made during 
appraisal interviews are likely to be viewed by 
the subordinate as considerably more impor- 
tant than the typical day-to-day task deci- 
sions of the work group. Most subordinates 
would probably find the appraisal interview 
considerably less threatening if they had some 

TABLE 2 


ANALYSIS OF VARIANCE FOR EFFECTS OF INTERVIEW TYPE 
AND SUBORDINATE PERSONALITY 

Satisfaction | Motivation 
Source $a Eccc P - j 
df PF MS P E 
Type of interview (A) 2 736** 307 c — X 
Authoritarianism (B) 1 2.74 nU 12.03e* 
AXB 2 | 0.68 3.63 2 
Within cell 48 dics 1.36 
| 
= : iew (A 2 39.02 92* 
Type of interview (A) 2 4.92 B g 
2$ -independence (B) 1 0.67 0.08 E 0.06 5.44 
Need-indey D nu idi voee 
E ie 0.93 es Es 
Wii i -ell 25 7.93 0.57 0.16 
ithin ce | 355 
TABLE 3 

ANALYSIS OF VARIANCE FOR THE EFFECTS 
OF INTERVIEW TYPE 

               |     | Satisfaction     | Motivation 
Source         | df  | MS    | F        | MS    | F 
Between groups | 2   | 98.62 | 12.44*   | 50.45 | 16.11* 
Within groups  | 105 | 7.93  |          | 3.13  | 


*p <1. 


influence over the outcomes and if the focus 
of the interview was on solving problems and 
improving performance rather than on evalu- 
ating the interviewee. For this reason, sub- 
ordinates can be expected to prefer some de- 
gree of participation in appraisal interviews, 
and this preference will probably be strong 
enough to mask any effects of personality 
differences among subordinates. Furthermore, 
commitment to goals is more likely to result 
when subordinates, regardless of subordinate 
personality, participate in setting performance 
goals than if these goals are prescribed by 
the interviewer in an autocratic fashion. 

[...] interviewees’ subsequent improvement in job 
performance could be obtained in this labora- 
tory [...] is some evidence 
[...] Burke and Wilcox (1969) 
[...] both [...] with the appraisal in- 
terview and expressed motivation to improve 
performance are related to actual improve- 
ment in performance. These authors found 
[...] were un- 
aware of the hypotheses being tested. More- 
over, if a demand effect had occurred, the 
correlation between the two dependent varia- 
bles would have been considerably higher than 
the .54 correlation that was found. The 
magnitude of this correlation is not much 
higher than the correlation (r= .43) ob- 
tained between the same variables in the field 
study by Burke and Wilcox (1969). 
Conclusions from laboratory research on 
appraisal interviews should not be uncondi- 
tionally generalized to actual organizations 
until confirmed by field research. Neverthe- 
less, the considerable difficulty of conducting 
field research on appraisal interviews sug- 
gests that laboratory simulations and role- 
playing experiments can be useful methods 
for conducting preliminary investigations of 
the appraisal interviewing process. 
WORKER ADJUSTMENT TO THE FOUR-DAY WEEK: 
A LONGITUDINAL STUDY 
St. Mary's Mission School, St. Marys, Alaska 


A longitudinal, exploratory study of employee responses to the four-day work 
week was conducted in a medium-sized pharmaceutical company. While reac- 
tions were generally positive, the patterns of response changed with time. 
After 1 year, differing effects of the four-day week seemed to be associated 
with job pace, worker plans to use their leisure time, and age. Absenteeism 
decreased after the change and declined more 1 year later, and workers re- 
ported sleeping less and having more unfavorable effects on home life. Women 
reported more favorable effects on home life and task-oriented plans than men. 


The four-day work week has been and most 
probably will continue to be introduced in 
many organizations. This change may have 
profound effects on the lives of many indi- 
viduals. While considerable speculation and 
anecdotal information has been published, to 
date there is little reliable empirical evidence 
about the effects of the four-day week on 
workers. 

The only major empirical study that could 
be found was reported by Steele and Poor 
(1970). Their data showed that workers over- 
whelmingly saw the change as beneficial for 
their lives at work and at home. However, 
because Steele and Poor’s study was primarily 
descriptive and cross-sectional and the authors 
did not state how long various parts of their 
population had been on the four-day week, in- 
ferences concerning individual adjustment over 
time were not possible. 

The present exploratory research was de- 
signed to provide longitudinal data on the 
responses of people to the four-day work week. 
Since it was expected that reactions to the 
four-day week would change over time, the 
design called for repeated measurements in 
the same plant. 


METHOD 

Subjects 

The data were collected from employees of a 
unionized medium-sized St. Louis-based pharmaceu- 
tical company that had recently changed from a 
work week of five 8-hour days to four 9½-hour 
days.3 The plant employed approximately 100 mem- 
bers of each sex for whom the average age was in 
the late 40s. The employees were described by man- 
agement as a closely knit group; many of the work- 
ers were related to each other. Only foremen, group 
leaders, and lower level employees who were working 
in the plant for at least 10 of the 12 months covered 
by the survey and were employed there at the time 
of both the first and the third administrations of the 
questionnaire were included in the analysis. From this 
pool, 131, 126, and 111 usable questionnaires were 
received for Surveys 1, 2, and 3, respectively. For 
most analyses all the subjects were included; how- 
ever, for several analyses only those 59 subjects who 
responded to all measurements were included. 

The plant itself was highly automated. Many jobs 
were heavily machine paced; for the most part work 
on these jobs consisted primarily of monitoring ma- 
chinery. 

1 Requests for reprints should be sent to [...] 
Nord, Graduate School of Business Administration, 
Washington University, St. Louis, Missouri [...]. 
2 The authors wish to thank Francis Connelly for 
his help with parts of the data analysis. 


Procedure 


A questionnaire was designed to elicit informa- 
tion on demographic factors, attitudes toward the 
four-day week, and changes in work and home life 
resulting from the new work schedule. The questions 
were mainly open ended, although some closed and 
scaled items were used. The questionnaire was dis- 
tributed with each person’s paycheck at three dif- 
ferent times. The first administration was July 7, 
1971, 6 weeks after the initial trial period had be- 
gun; the second administration was August 24, 1971, 
13 weeks after the initial trial period had begun; and 
the final administration was May 23, 1972, approxi- 
mately 1 year after the initial trial period had be- 
gun. The same questionnaire was used each time, ex- 


3 While initially the change was on a trial basis, 
there was every indication that it would be perma- 
nent if it was successful. Approximately three months 
after the beginning of the trial period, it was for- 
mally announced that the change would be perma- 
nent. 




TABLE 1 

OPEN-ENDED QUESTIONS AND MAJOR CATEGORIES AND ELEMENTS FOR CODING 

1. What plans (if any) have you made for the three-day weekend? 

   Dichotomous coding category: 
   a. Plans versus no plans. 
   b. Type of plan: task or recreational. 

   Element: 
   a. Plans listed or not. 
   b. Task: shopping, doctor’s appointments, work around house, moon- 
      lighting, special work projects. 
      Recreational: sports, rest and relaxation, visit relatives and socialize, 
      travel. 

2. In the last ___ weeks, have you noticed anything different about your job? If yes, please describe. 

   Dichotomous coding category: 
   Job changes: favorable versus unfavorable to company goals. 

   Element: 
   Favorable to company goals: more work done, higher employee morale, 
   more relaxed at work, employees busier, better continuity of projects, less 
   absenteeism, better adjusted. 
   Unfavorable to company goals: less work done, greater fatigue, lower em- 
   ployee morale. 

3. In the last ___ weeks, have you noticed anything different about the plant (or office)? 
   If yes, please describe. 

   Dichotomous coding category: 
   Changes in plant: favorable versus unfavorable to company goals. 

   Element: 
   Favorable to company goals: more work done, higher employee morale, 
   more relaxed at work, employees busier, better continuity of projects, 
   less absenteeism, better adjusted. 
   Unfavorable to company goals: less work done, greater fatigue, lower em- 
   ployee morale. 

4. In the last ___ weeks, have you noticed anything different about your home life? 
   If yes, please describe. 

   Dichotomous coding category: 
   Changes in personal life: favorable versus unfavorable to self. 

   Element: 
   Favorable to self: more time with family and friends, more rest and relaxa- 
   tion, get more done at home, other happiness feelings, moonlighting, better 
   adjusted. 
   Unfavorable to self: less time with family and friends, problems (e.g., meals, 
   housekeeping), more tired after work, emotional problems, spend more 
   money. 


cept for changes in time references and the addition 
of a question after each survey. 

Interviews with the personnel and other managers 
were used to supplement the questionnaire data but, 
except in interpreting the absenteeism data, are given 
little attention in this report. 


Analysis 


Responses to the closed-end items were simply 
recorded. The open-ended responses were coded into 
the categories listed in Table 1. The coded and 
closed-end items for each survey were cross tabu- 
lated. Only the cross tabulations that were relevant 
to the effects of the four-day week were run; such 
relationships as age versus sex or age versus number 
of children were not considered. 


To determine overall attitudes, subjects were asked 
to check one of five statements that best described 
their feelings about the old and new work weeks. 
Two statements favored the four-day week, one was 
neutral, and two favored the five-day week. For 
[...] the two favorable statements 
[...] as showing favorable overall atti- 
tudes toward the four-day week; the other three were 
[...] as showing less favorable attitudes. 

Information was also collected about the subjects 
themselves. The demographic variables of sex, age, 
marital status, [...] 
dichotomized [...] 
41 and over, married versus not married, and one or 
more children living at home versus none living at 
home. In addition, data on the average number of 


hours of sleep and data on weight were included for 
analysis. 

Finally, the jobs were classified as high or low 
paced. High-paced jobs were ones where the worker’s 
pace was determined primarily by an assembly line 
or a machine; these jobs were mainly in shipping 
[...], and packaging. Low-paced 
jobs included office, janitorial, maintenance, and 
cafeteria personnel. 

Statistical analysis. Due to the small number of 
responses for each coding element shown in Table 1, 
[...] were used as the unit 
for analysis. The NUCROS and CROSTAB programs 
were used to cross tabulate the results of pairs [...] 
[...] tests were appropriate [...] 
comparison. To test the differences between time 
[...] of variance or tests of differences between 
proportions were used. To test relationships within [...] 

RESULTS 

[...] toward the four-day week [...] highly favorable [...]. 
[...] percentages remained nearly constant [...] three surveys. 

Overall attitude and pace. Table 2 shows the frequency 
of responses for [...] toward the four-day week for 
individuals under high- and low-paced jobs. It can be seen 
that in all three periods, workers favored the four-day week 
for both types of jobs. While no [...] association between 
job pace and attitude was found 6 or 13 weeks after initiation, 
after 1 year attitudes of workers on low-paced jobs 
tended to be less favorable towards the four- 
day week than those of workers on high-paced 
jobs (χ² = 4.21, p < .05). To further explore 
the correlates of attitudes and job pace, chi- 
square and/or Fisher exact probabilities were 
calculated for the relationship of pace with 
perceived changes in home life, in the job, and 
in the plant for each survey. None of these 
tests reached statistical significance. 

TABLE 2 

CROSS TABULATIONS OF ATTITUDE TOWARD FOUR-DAY 
WEEK AND JOB PACE OVER TIME 

             Time since initiation 
             6 weeks        13 weeks       1 year a 
Job          Favorableness of attitude 
             Low | High  |  Low | High  |  Low | High 
Low-paced    12  | ...   |  ... | ...   |  ... | ... 
High-paced   10  | ...   |  8   | 45    |  6   | 49 

a χ² = 4.21, p < .05. 

Overall attitude and selected demographic 
variables. The results of cross tabulations of 
attitude with age, sex, marital status, and chil- 
dren living at home were subjected to chi- 
square analysis for each of the three surveys. 
No significant associations between attitudes 
and these demographic factors appeared for 
any of the three periods. 

TABLE 3 

FREQUENCY OF ATTITUDES TOWARD THE FOUR-DAY WEEK 
VERSUS THE PRESENCE OR ABSENCE OF 
PLANS FOR THE THREE-DAY WEEKEND 

              Time since initiation 
              6 weeks a     13 weeks      1 year b 
Attitude      Plans 
              Yes | No   |  Yes | No   |  Yes | No 
Favorable     62  | 44   |  51  | 50   |  43  | 42 
Unfavorable   2   | 21   |  8   | 14   |  4   | 16 

a χ² = 18.74, p < .001. 
b χ² = 6.12, p < .01. 

Overall attitudes and specific attitudinal di- 
mensions. To determine what specific attitudes 
might account for differences in overall atti- 
tudes, all of the variables in Table 1 were 
tested for their association with overall atti- 
tudes. Chi-square analysis was used to test the 
association between workers’ overall attitudes 
and the number of specific plans. These data, 
summarized in Table 3, indicated that people 
without plans held less favorable overall 
attitudes toward the new work week than 
people with plans both 6 weeks and 1 year 
after the change (χ² = 18.74, p < .001 and 
χ² = 6.12, p < .01, respectively). While 
many people without plans were still favor- 
TABLE 4 

PROPORTIONS OF REPORTS OF CHANGES TO TOTAL 
NUMBER OF RESPONDENTS BY TIME PERIOD 

                  Time since initiation of four-day week 
Type of change    6 weeks      13 weeks     1 year 
                  (N = 131)    (N = 126)    (N = 111) 
Plant             .13          .13          .16 
Job               .20          ...          .32 
Home life         ...          ...          ... 
Plans             .79          ...          .83** 

* Different from p1, p < .05. 
** Different from p2, p < .05. 
† Different from p1, p < .01. 


able to the four-day week, a large majority of 
those who held less favorable attitudes did not 
mention any plans for the three-day weekend. 
The mention of plans was also positively as- 
sociated with favorableness of the reported 
changes in home life 6 weeks after initiation 
of the four-day week (χ² = 5.28, p < .05). 
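
The chi-square values reported for the attitude-by-plans cross tabulation can be checked from the cell counts printed in Table 3. The sketch below assumes a Pearson chi-square without continuity correction, which is a guess at the authors' procedure rather than something the text states; the function name `chi_square_2x2` is illustrative.

```python
def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Rows: favorable, unfavorable attitude; columns: plans yes, plans no (Table 3).
six_weeks = [[62, 44], [2, 21]]
one_year = [[43, 42], [4, 16]]

chi2_six_weeks = chi_square_2x2(six_weeks)
chi2_one_year = chi_square_2x2(one_year)
print(f"6 weeks: {chi2_six_weeks:.2f}, 1 year: {chi2_one_year:.2f}")
```

Under that assumption both results fall within about 0.01 of the printed values (18.74 and 6.12), which suggests the uncorrected statistic was used.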

Further analysis revealed that the type of 
plans people had (recreational vs. task ori- 
ented) was not related to their attitudes after 
6 and 13 weeks. However, 1 year after initia- 
tion people with recreationally oriented plans 
were apt to show less favorable attitudes 
toward the four-day week (Fisher's exact test, 
p < .001). 

The association of overall attitudes with 
changes in job and plant was tested for each 
period using Fisher's exact test. No signifi- 
cant associations were found between attitudes 
and perceived changes at any time period. 
However, after both 13 weeks and 1 year, 
workers who perceived the changes as favor- 
able to company goals held more favorable 
attitudes toward the four-day week (p = 
.0006 and p = .0036, respectively). 


Specific Attitudes over Time 


In order to see what specific changes oc- 
curred over time, the proportions of the num- 
ber of reports of perceived changes in job, 
plant, home life, and the number of plans to 
the total number of subjects in each survey 
were calculated. The results presented in Ta- 
ble 4 showed a significant tendency for people 
to perceive more changes in their home life 
after 6 weeks than after 13 weeks (Z = 2.22, 
p < .05) and 1 year (Z = 2.66, p < .01). 
Moreover, they reported fewer plans after 13 
weeks than after 1 year (Z = 2.55, p < .05). 
Similarly, they reported making fewer plans 
after 13 weeks than after 6 weeks (Z = 1.82, 
p < .07). Also, they perceived more job 
changes over time: after 1 year more job 
changes resulting from the new work week 
were reported than after 6 weeks (Z = 1.93, 
p < .06). Thus, over the period of 1 year sub- 
jects tended to report proportionately fewer 
changes in home life but slightly more changes 
in job life. 

The data were tested for qualitative changes 
over time. Tests of proportions compared the 
three periods on changes in plant, changes in 
job, changes in home life, plans versus no 
plans, and type of plans. No significant differ- 
ences in proportions between time periods 
were found for changes in plant, changes in 
job, plans versus no plans, and types of plans. 
However, as shown in Table 5, 6 and 13 
weeks after initiation 63% and 64% of the 
effects on home life were seen as favorable, 
whereas after 1 year only 45% were seen as 
favorable. The differences between the first 
two and the third periods were both statisti- 
cally significant (Z = 2.13, 2.12; p < .05, re- 
spectively). For the most part these differ- 


TABLE 5 

FREQUENCY AND PROPORTIONS OF RESPONSES 
INDICATING EFFECTS ON HOME LIFE 

                               Time since initiation 
                               6 weeks     13 weeks    1 year a 
Effect on home life            n    %      n    %      n    % 
Favorable to personal life     63   63     47   66     29   45 
Unfavorable to personal life   36   37     24   34     35   55 
Total                          99   100    71   100    64   100 

a Proportions differed significantly from 6 and 13 weeks, 
p < .05. 


TABLE 6 


FREQUENCIES OF TYPE OF PLANS FOR THREE-DAY 
WEEKEND VERSUS AGE BY TIME PERIOD 


Time since initiation of four-day week 


| 6 weeks 13 weeks 1 yearb 
Plans — - — - 
| 
| 
40 and) 4,4 | 40 and | p4 [40 and | yy 
| under | under under r 
jala [HÀ = 
Rec 30 | 29 35 30 22 
Task oriented | 19 | 26 9 122 | 28 
a χ² = 4.08, p < .05. 
b χ² = 6.99, p < .01. 


ences were due to workers reporting few favor- 
able effects after one year. 


Personal Factors and the Four-Day Week 


Data on the personal variables (a) hours of 
sleep and (b) weight were collected after all 
three periods. For each dependent variable, 
an analysis of variance for repeated measures 
was performed on the 59 subjects who an- 
swered these questions in all three surveys. No 
statistically significant weight changes were 
found. However, after 1 year on the four-day 
week, workers averaged 6.72 hours of sleep 
per night as compared to 7.05 hours per night 
on the five-day week and 6.98 and 7.02 
hours per night after 6 and 13 weeks on the 
four-day week. This main effect was statisti- 
cally significant (F = 2.95 [...]). 
Most of this effect was [...] 
hours of sleep reported [...] 
four-day week. 
Demographic factors. The appropriate test 
of association was run for age versus changes 
in home life, types of plans, changes in job, 
and changes in plant on the data from each 
time period. Of the 12 tests, only the relation- 
ship between workers’ age and the type of 
plans they made reached an acceptable level 
of significance. As shown in Table 6, 13 weeks 
and 1 year after initiation, older workers 
tended to make more task plans and younger 
workers made more recreationally oriented 
ones (χ² = 4.08, p < .05; χ² = 6.99, p < .01). 

The appropriate tests of association of sex 
of worker versus changes perceived in plant, 
home life, plans, and job were run for each 
survey. Of the 12 tests, 3 reached acceptable 
levels of significance; all 3 of these relation- 
ships appeared in the third survey. As shown 
in Table 7, after 1 year females (more than 
males) saw the change to the four-day week 
as resulting in changes on their job which 
were more favorable to the company and 
changes in their home life which were more 
favorable to them. Moreover, females made 
more task-oriented plans and males made 
more recreationally oriented plans. Again, 
these relationships reached statistical signifi- 
cance only after 1 year. 

The appropriate test of association was run 
for marital status versus home life and plans 
for all three surveys. None of these relation- 
ships was statistically significant. 

In the surveys conducted 13 weeks and 1 
year after the change, information on the 
number of children living at home was col- 
lected. The appropriate test for association 
was run on children living at home (none, one, 


TABLE 7

[...] FOR SEX VERSUS CHANGES IN JOB, PLANS, AND CHANGES IN HOME LIFE

[Table body garbled in the source. Recoverable column groups:
changes in job (favorable to company goals vs. [...]); plans
(task vs. recreation); changes in home life (favorable vs.
unfavorable to personal life). Rows: male, female. Table note
fragment: "[...] exact probability test)".]
or more) versus plans and home life. Of the
four tests, only one relationship reached sig- 
nificance. One year after the change, people 
with one or more children living at home more 
frequently felt the four-day week had un- 
favorable consequences on their personal life 
(χ² = 4.66, df = 1, p < .05) than did people
with no children living at home. 


Absenteeism 


The average hours of absenteeism per day 
for the 5 months prior to and the 16 months 
following the introduction of the four-day 
week are reported in Table 8. In order to con- 
trol for the partial confounding of the change 
in the length of the work week with normal 
seasonal variation, two sets of comparisons 
were made. 

1. The January-May 1971 period (prior to 
the four-day week) was compared with both 
the June-September 1971 period (the first 
4 months following the new work week) 
and the January-May 1972 period. 

2. The June-September 1971 period was 
compared with the October-December 1971 
and June-September 1972 periods. 


As the footnote to Table 8 implies, the periods
were not directly comparable due to a 10%
reduction in the work force. However, even if a
full 10% was subtracted from the figures for the
earlier periods, absenteeism after initiation
would still be less than it was before. The
January-May figure minus 10% is about 42.7
compared to 37.7 for the comparable 1972 period
and 40.5 for the June-September period of 1971.
Similarly, the June-September 1971 figure minus
10% is about 34.6 compared to 30.2 for the
comparable 1972 period, although the
October-December 1971 period would appear to be
the same or slightly higher than that of the
preintroduction period. Thus, when seasonal
factors and changes in the number of employees
are controlled for, the four-day week was
associated with a decrease in absenteeism of
over 10%. Moreover, when the 4-month period
immediately following the introduction of the
four-day week was compared with the comparable
period 1 year later, the level of absenteeism
was approximately 10% lower for the latter
period.
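
The adjustment described in this paragraph can be sketched in a few lines of Python. This is an illustrative calculation only: the figures come from Table 8, while the function names and the use of a flat 10% scaling are assumptions made for the sketch. Applied to the June-September 1971 figure, a flat 10% subtraction gives about 36.5 rather than the 34.6 quoted in the text, so the original adjustment for that period may have differed slightly; the sketch does not try to reproduce it.

```python
# Hours of absenteeism per work day, as reported in Table 8.
HOURS = {
    (1971, "January-May"): 47.4,
    (1971, "June-September"): 40.5,
    (1971, "October-December"): 44.5,
    (1972, "January-May"): 37.7,
    (1972, "June-September"): 30.2,
}

# Approximate decline in employment (Table 8 note); a flat rate is an assumption.
WORKFORCE_REDUCTION = 0.10


def adjusted(hours: float, reduction: float = WORKFORCE_REDUCTION) -> float:
    """Scale an earlier-period figure down by the workforce reduction."""
    return hours * (1.0 - reduction)


def percent_decrease(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    for period in ("January-May", "June-September"):
        base = adjusted(HOURS[(1971, period)])
        later = HOURS[(1972, period)]
        print(f"{period}: adjusted 1971 = {base:.2f}, 1972 = {later:.1f}, "
              f"decrease = {percent_decrease(base, later):.1f}%")
```

Under these assumptions both year-over-year comparisons show a decrease larger than 10%, which is consistent with the conclusion drawn in the text.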




TABLE 8
HOURS OF ABSENTEEISM PER WORK DAY

                          Period
Year    January-May    June-September    October-December
1971       47.4             40.5               44.5
1972       37.7             30.2           Not available

Note. During the 21-month period, employment declined
approximately 10%. Thus, the early 1971 figures are based on
slightly more people than the later 1971 and the 1972 figures.

Discussion 


The results of this study must be inter- 
preted cautiously for several reasons. First, 
the data were collected from only one plant 
for 1 year. Second, the differences between 
some of the periods were confounded with 
seasonal variations and other uncontrolled 
forces. Finally, the effect of repeated ques- 
tioning is unknown. Given these limitations, 
this exploratory study still revealed several 
trends to direct future research. In particular, 
employees had consistently positive attitudes 
toward the four-day week. They saw the 
change as favorable to company goals and as 
having little effect on their individual work 
environment throughout the research period. 
The perceived favorableness of the effects on 
home life varied over time. One year after the 
change, the effects of the four-day week on 
home life were perceived as significantly less 
positive than at first.

In addition to these findings, several possi- 
bly significant patterns emerged. 

1. Most reports of unfavorable changes were 
home rather than work related. For all three 
surveys, only 37 responses indicating un- 
favorable changes in job and changes un- 
favorable to company goals were reported. By 
contrast, the corresponding number of un- 
favorable changes in home life was 95. More- 
over, the proportion of negative changes on 
home life increased significantly over time; no 
similar changes occurred on work-related re- 
sponses. 

2. Although the data on absenteeism were
confounded with decreases in employment
and seasonal factors, absenteeism 12 to 1[...]
months after the introduction of the four-day
week was approximately 10% less than the
4-month period immediately following the
change.

3. Perhaps the most important implication 
of the present study for future research con- 
cerns the observed tendency for few signifi- 
cant patterns of response to occur soon after 
the change but for a larger number of pat- 
terns to become apparent over the period of 
1 year. While a certain number of statisti- 
cally significant results were to be expected as 
a function of the number of tests run, the 
tendency for significant results to appear after 
1 year but not after 6 or 13 weeks suggests 
that many of the effects of the four-day week 
may be stronger over time. Thus, studies of 
the four-day week may yield sharply different 
results depending on how long after the 
change they are conducted. Individual
adjustments to changes which affect established
daily routines may take some time to develop 
into new stable patterns. 
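
The caveat that some significant results are to be expected simply from the number of tests run can be made concrete with a short calculation. The sketch below is only illustrative: it assumes the tests are independent and uses 12 tests at the .05 level, the size of the test families reported for age and for sex; the function names are hypothetical.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of tests significant by chance alone."""
    return n_tests * alpha


def prob_at_least_one(n_tests: int, alpha: float) -> float:
    """Chance of one or more spurious results, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    n_tests, alpha = 12, 0.05  # assumed family size and significance level
    print(f"expected chance results: {expected_false_positives(n_tests, alpha):.2f}")
    print(f"P(at least one): {prob_at_least_one(n_tests, alpha):.3f}")
```

With a family of 12 tests, a single isolated significant result is therefore weak evidence on its own, which is why the clustering of significant results in the 1-year survey is the more informative pattern.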

The specific content of some of the relationships
that emerged over time also merits
further investigation. The fact that after both 
6 weeks and 1 year workers who reported 
making plans to use the three-day weekend 
were significantly more likely to perceive the 
new work week favorably draws attention to 
social psychological problems of leisure. While
the findings in this study could be due to such 
factors as personality characteristics, if they 
are replicated in other research, organizations 
[...] in use of leisure to be a
[...] support for this argument [...]
effects on home life [...]
to become more negative
over time. Perhaps when the novelty of the
change subsided, some individuals, particu- 
larly the roughly 20% who had not made any 
plans, found the larger block of leisure time 
less attractive than at first. 

The data also suggested that age and sex 
may be related to plans to use leisure time. 
Since younger and male workers were more 
apt to make recreationally oriented plans than 
older and female workers and females were 
more apt to see the influence on home life as 
being more favorable, perhaps females experi- 
enced interrole conflict between their job out- 
side the home and their role of “homemaker.” 
Their new full day of “leisure” permitted 
them to catch up on the “housework” that 
the social norms of our society still require 
of women. Their new leisure time was more 
“structured” and they felt better about their 
performance of one of their major social roles. 
Males, by contrast, may have had fewer tasks 
required by their nonwork roles. They made 
more recreational plans but did not feel the 
increment in satisfaction from a reduction in 
interrole conflict that females might have ex- 
perienced. If future research replicates these 
findings, and our sex roles do not change 
radically, it may well be that the four-day 
week will be of greater benefit for females 
than for males. 
SATISFIERS AND DISSATISFIERS AMONG WHITE-COLLAR


AND BLUE-COLLAR EMPLOYEES 


EDWIN A. LOCKE ²
University of Maryland 


Two “accidental” samples of white- and blue-collar employees described
satisfying and dissatisfying job incidents. These incidents were categorized
using a new (event-agent) classification system developed by Schneider and
Locke to overcome certain limitations in Herzberg’s system. It was found that
the same categories of events led to both satisfaction and dissatisfaction within
each job level. However, different agents were seen as responsible for these
events: the self for satisfying events and others for dissatisfying events.
White-collar employees in one sample mentioned task events significantly
more often and reward and context events significantly less often than
blue-collar workers as sources of satisfaction and dissatisfaction. This finding was
not replicated in the second sample. A possible explanation, based on
differences in the occupational makeup of the two samples, was offered for this
discrepancy.


In an extensive review of the job satisfac- 
tion literature, Herzberg, Mausner, Peterson, 
and Capwell (1957) concluded: 


One of the most consistent findings is that Intrinsic
aspects of the job are more important to employees
at higher occupational levels. On the other hand,
Security appears to be less important to these same
employees [p. 54].


Friedlander (1965) used direct ratings of 
importance on a 5-point scale to compare 
white- and blue-collar employees at one loca- 
tion of a branch of the United States Govern- 
ment. He found that white-collar employees 
rated social-environmental factors (security, 
work group, co-workers) as significantly less 
important and intrinsic task factors (achievement,
challenge, use of abilities) as significantly
more important than blue-collar
employees. There were no marked differences
between these two groups with respect to 
what Friedlander called “recognition through 
advancement” (which included recognition, 
responsibility, and promotion). 

More recently, Armstrong (1971) compared
engineers with assemblers using the same type
of importance ratings as Friedlander, but a
somewhat different method of classification.
Armstrong’s system was based on Herzberg,
Mausner, and Snyderman’s (1959) content-context
dichotomy. The engineers generally ranked job
content factors (recognition, responsibility,
achievement, promotion, and work itself) higher
and the job context factors (salary, security,
status, supervision, peer relations, company
policy, and working conditions) lower than the
assemblers (based on mean within-group ranks*).
Within the content category, however, the largest
differences in ranks were for achievement and
the work itself (mean difference in ranks = 7.5)
and the smallest for recognition, responsibility,
and promotion (mean difference in ranks = 3.7).

¹ The author is indebted to Esther Kamelgarn,
Anne Locke, Richard Parker, and Charles Rosolic [...]

² [...] Locke, Department of [...] Business
Administration, University of Maryland, College Park,
Maryland 20742.

Gluskinos and Kestelman (1971) obtained 
results similar to Armstrong's using a direct 
ranking method in a comparison of manage- 
ment and office workers with factory workers 
in one manufacturing plant.

Centers and Bugental (1966) obtained results
similar to those of Friedlander (1965) and
Armstrong (1971) with a cross-sectional sample
of working adults in a major urban area. These
subjects were asked to rank six factors in terms
of their (judged) importance in “keeping you on
your present job [p. 194].” Centers and Bugental
found that white-collar workers were more likely
to rank work interest, use of skills, and
intrinsic satisfaction and less likely to rank
pay, co-workers, and security among the top three
factors than were blue-collar workers.

* Armstrong evidently did not perform statistical
tests between groups on either the importance
ratings or the rankings derived from these ratings. The
mean ranks discussed throughout the paper were
computed by the present author.

Only one study to date has attempted to 
compare the determinants of satisfaction 
among white- and blue-collar workers by using 
a critical incident approach (Myers, 1964). 
Myers’ study used Herzberg’s (Herzberg et 
al., 1959) critical incident method, but it in- 
cluded only one blue-collar sample—female 
assemblers. Unlike the white-collar sample, 
this group rarely mentioned either advance- 
ment or responsibility as sources of either sat- 
isfaction or dissatisfaction. However, Myers 
claimed that there was little difference be- 
tween this group and the white-collar groups 
on other factors (e.g., achievement, pay, etc.). 
While this pattern of results appears to differ 
from that obtained by Friedlander and Arm- 
strong, Myers’ results cannot be taken as con- 
clusive. First, no statistical tests were per- 
formed on the data. Second, and more im- 
portant, the data themselves cannot be mean- 
ingfully interpreted because of limitations in 
the incident classification system used. 
Schneider and Locke (1971) have shown that 
the Herzberg system confuses two levels of 
analysis, namely, events (what happened) 


and agents (who made it happen). They argue
that

A logically adequate classification system for job
incidents would have to classify separately by events
and agents. Anything that causes a person to be
satisfied or dissatisfied should be perceived (at least in
[...]) as being the result of some event which occurs
[...] some perceived condition (e.g., success,
praise, etc.); and every event or condition which
[...] perceived as being caused by someone or
something (e.g., supervisor, self, nature, etc.) [...].

Herzberg's system classifies the reported
incidents sometimes according to event and
sometimes according to agent, but it does not
consistently classify them according to [...].
Thus, the data categorized with this system
cannot be interpreted in a consistently logical
manner.

Schneider and Locke (1971) developed a
new classification system based on the event-agent
dichotomy. Using several predominantly
white-collar samples, they found that the 
same categories of events produced both sat- 
isfaction and dissatisfaction, although differ- 
ent agents were judged to be primarily re- 
sponsible for these events (the self for 
satisfying events and others for dissatisfying 
events). 

The two purposes of the present study were 
(a) to attempt to replicate this finding on 
both white-collar and blue-collar subjects and 
(b) to use this new classification scheme to 
compare white- and blue-collar employees with 
respect to sources of satisfaction and dissatis- 
faction. 

The present study utilized two separate 
samples of both white- and blue-collar em- 
ployees. Each of these samples was “acci- 
dental"; that is, there was no attempt made 
to sample a given organization, industry, loca- 
tion, or occupational group (although there 
was matching on age and sex). The data in 
each sample were collected as part of a course 
project by undergraduate students in business 
administration from among their friends and 
acquaintances who were working on full-time 
jobs. While there were some disadvantages to 
this procedure (e.g., it is not systematic), the 
diversity of the occupations involved pre- 
sented a test of the “robustness” of the rela- 
tionships being studied. Furthermore, the use 
of two different samples gathered at different 
times allowed the opportunity for replication. 


METHOD 
Study 1 


Sample. Each student interviewed one white-collar 
and one blue-collar employee. The members of the 
pair had to be (a) of the same sex, (b) within seven 
years of each other in age, (c) have been employed 
for at least six months on a full-time job, and (d) 
must not have been interviewed by any other student 
doing the same project. Usable protocols for 36 
white-collar and 31 blue-collar employees were ob- 
tained. (Protocols which included incidents that were 
not specific or bounded in time, e.g, “I am happy 
with my pay,” were discarded.) 

Forty-four percent of the white-collar subjects and 
48% of the blue-collar subjects were under 30 years 
of age, and 13%-14% of the subjects in each group
were 50 years old or older. Seventy-five percent of 
the white-collar subjects and 87% of the blue-collar 
subjects were male. 

Procedure. Each interviewer obtained one satisfying
and one dissatisfying incident and then a second
satisfying and dissatisfying incident from each em-
ployee. The order of asking for incidents was counter- 
balanced both within and across subjects. The inter- 
view questions were a slightly modified version of 
those used by Herzberg et al. (1959): 


Please try to think of a time when you felt espe- 
cially Good(Bad) about your job. Can you think 
of such a high(low) point in your feelings about
your job? . . . Please tell me about it; that is, 
tell me what happened to make you feel that way. 


Event and Agent Categories. Descriptions of the 
satisfying and dissatisfying (hereafter to be called 
good and bad) event categories are shown in Table 
1. Subjects whose event(s) fell in Category 11 were 
dropped from the analysis. The agent categories were
the following:

1. Self (the respondent himself).
2. Supervisor or other specific superior or superiors
of respondent.
3. Co-worker(s) of respondent (someone at same
level in organization or profession).
4. Subordinate(s) of respondent (someone at lower
level in organization or profession).
5. Organization, management or organizational
policies. (No particular person or persons cited.)
6. Customer(s) of respondent (including students,
patients, buyers, etc.).
7. Nonhuman agent (nature, machinery, weather,
neighborhood, equipment, God, etc.).
8. No agent indicated (e.g., luck, the breaks, that’s
the way it is, or do not know, or unclassifiable).

These categories constitute a refinement and elaboration
of the category system developed by Schneider
and Locke (1971).

Analysis. As part of the procedure the employees
were read the event and agent categories
and asked to classify their own incidents. With
respect to the agent categories, the employees' own
classifications were used since these appeared to be
less prone to blatant errors. The employees' event
categorizations were not used, however, since there
were a number of instances in which the classifications
were clearly wrong, that is, incompatible with
the actual incidents described. (Grigaliunas &
Herzberg, 1971, have warned about the possibility of
subjects misclassifying their own incidents.) Since
the student interviewers had relatively little training
in interviewing, they were instructed not to pursue
such discrepancies very far, for fear it would bias
the results. Furthermore, at the time of data collection
the students did not know anything about the
Herzberg theory and had not been individually
trained in how to use the classification scheme.³

³ [...] project participants, however, were given a
full [...] of instruction and [...]
procedures. It should be noted that when the results
using the classifications made by the employees
themselves were compared with the results obtained using
Three “Herzberg-naive” undergraduate students
with grade point averages of or higher were hired 
to read and categorize each of the stories. Based on 
his judgment as to what event appeared to be most 
important, the coder was to make only one event 
categorization per incident. The third coder's task 
was solely to resolve contradictions between the first 
two. When all three coders disagreed, the present 
writer had to make the final decision. This was 
necessary in 10 cases out of 268 (the latter based on 
67 subjects, four stories each). These were the cases 
in which there was genuine ambiguity as to the most
important incident.
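
The coding procedure just described can be written out as a small decision rule. The sketch below is one possible reading rather than a record of how the coders actually worked: it assumes the third coder's judgment settles a disagreement only when it matches one of the first two codes, and that the writer's decision is used otherwise. The function name and argument names are hypothetical.

```python
from typing import Optional


def resolve_event_code(first: int,
                       second: int,
                       third: Optional[int] = None,
                       author: Optional[int] = None) -> int:
    """Return the final event category for one incident.

    first, second: codes from the two primary coders.
    third: code from the third coder, consulted only on disagreement.
    author: final decision, consulted only when all three coders differ.
    """
    if first == second:
        return first
    if third is None:
        raise ValueError("coders disagree; a third coder's code is required")
    if third in (first, second):
        return third
    if author is None:
        raise ValueError("all three coders disagree; a final decision is required")
    return author


if __name__ == "__main__":
    print(resolve_event_code(4, 4))                  # primary coders agree
    print(resolve_event_code(4, 7, third=7))         # third coder breaks the tie
    print(resolve_event_code(4, 7, third=2, author=4))  # writer decides
```

In Study 1 the last branch corresponds to the 10 cases out of 268 in which the writer made the final decision.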


Study 2 


Sample. The sampling procedures used in this study 
differed from those in Study 1 in only one respect: 
Instead of interviewing one white-collar-blue-collar 
pair and asking each employee for two satisfying and 
two dissatisfying incidents, each student interviewed 
two white-collar-blue-collar pairs, but asked each 
respondent for only one satisfying and one dissatis- 
fying incident. Protocols involving either nonspecific 
incidents or those which could not be coded due to 
ambiguity were discarded. Codable protocols were 
obtained from 94 white-collar and 66 blue-collar em- 
ployees. (As was the case in the first study, blue- 
collar employees were more likely to give nonspecific 
incidents than were white-collar employees.) 

Fifty-two percent of the white-collar sample and 
52% of the blue-collar sample were under 30 years 
of age, and 2% of the white-collar employees and 
6% of the blue-collar employees were 50 years old 
or older. Sixty-eight percent of the white-collar sample
and 79% of the blue-collar sample were male.

Procedure. The procedure used in this study dif- 
fered from that in Study 1 in two respects: To make 
it easier for the coders, the employee himself was 
asked to indicate (a) which particular event in the 
incident he (had just) described was the most im- 
portant in making him feel good (bad) and (b) 
who or what was most responsible for this event. 
The employee himself, however, did not make any
categorizations. 

Event and agent categories. These were the same as
in Study 1 (see Table 1 and the display on this page). 

Analysis. As part of the procedure the interviewers 
coded the incidents themselves, but since the inter- 
viewers (like the interviewees in Study 1) often made 
obvious errors in classification, these data were not 
used. The interviewers did not know anything about 
the Herzberg theory at the time they gathered the 
data.

The protocols were later coded by a junior high
school teacher with an MA degree in history and
who was unacquainted with the Herzberg
theory. The present writer coded the stories
independently. In each case, one event classification and
one agent classification were made for each story


specially trained coders (who did not see the em- 
ployee ratings), the findings were highly similar. This 
could not, of course, have been known in advance. 
TABLE 1
GOOD AND BAD EVENT CATEGORIES

1. Task activity
Good: Enjoyed the work task or task activity itself (regardless of
external rewards or outcomes). Given a desired task assignment.
Saw the work as important, significant, or meaningful.
Bad: Did not enjoy or disliked the work or task activity itself
(regardless of external rewards or outcomes). Given a disliked or
undesired task assignment (e.g., a dirty job). Saw the work as
unimportant, insignificant, or meaningless.

2. Amount of work
Good: Amount of work just right; neither too much nor too little;
work was easy to do (no specific success involved).
Bad: Not reasonable; too much or too little; work was especially
hard or difficult (no specific failure involved).

3. Smoothness
Good: Work went smoothly without (temporary) distraction or
interruption; work done efficiently (but no specific success
involved).
Bad: Work did not go smoothly. Temporary interruptions or
distractions; wasted time; work done inefficiently. (Note: if
actual failure, use #4 below.)

4. Success / Failure
Good (success: work achievement in relation to some standard):
Finished a task; completed an assignment or project; solved a
problem; reached a work goal; met a deadline; did a job especially
well or fast or skillfully; improved performance; had a project or
solution accepted by others; saw ultimate success of work; getting
a contract (if success was most salient); reaching a sales figure
if it represents a standard of achievement.
Bad (failure: work failure in relation to some standard): Did not
finish a task; did not complete an assignment or project; did not
solve a problem; failed to reach a work goal; did not make a
deadline; did a job especially poorly or slowly or unskillfully;
failed to improve performance or did worse than before; had a
project or solution rejected by others; saw ultimate failure of
work because not used or results of work damaged or destroyed;
failed to get contract or reach sales figure (if failure was most
salient); caused an accident (see also #11).

5. Promotion / Demotion or lack of promotion
Good: To a hi[...]; [...] promotion.
Bad: Did not get a desired promotion or promised promotion; no
opportunity for promotion (blocked opportunity).

6. Responsibility
Good: Was increased; given special assignment (not necessarily
promoted).
Bad: Was not increased as desired or as promised; did not get
special assignment that wanted to get; too much responsibility;
given responsibility without adequate training; reduction of
responsibility; unclear responsibility.

7. Verbal (or implied verbal) recognition of work
Good: Praised, thanked, complimented, given credit, given award, or
special recognition for a piece of work or for performance in
general (by company, supervisor, co-workers, subordinates, etc.);
given high rating for work.
Bad (negative recognition or lack of recognition): Criticized,
blamed, not thanked, not complimented, not given credit or credit
stolen by another, not given award, given reprimand, insulted for
a piece of work or for performance in general (by company,
supervisor, co-workers, subordinates); false accusation; given low
rating for work; gesture or look of disapproval; complaint about
product or work.

8. Money
Good: Received [...] raise or bonus or tip; made a profit; [...]
overtime work [...]; [...] of a raise; getting [...] (if [...]
salient; see also #4).
Bad: Did not receive a desired raise or promised money bonus; did
not make a profit; no overtime pay; no tip or low tip; salary or
raise unfair (compared to others); failed to get contract or sale
(if money was most salient; see also #4).

9. Interpersonal atmosphere
Good: [...] everyone was getting along [...], friendly, interesting
conversation; [...] interaction with others (e.g., office party);
[...] nonwork action. (Note: [...] for work; see #7.)
Bad: In general was unpleasant; everyone was getting along poorly,
hostile, unfriendly, touchy, etc.; obscene language used in
presence; dull conversation; unpleasant nonwork interaction with
others; criticized for nonwork action. (Note: do not use this if
criticized for work; see #7.)

10. Physical working conditions
Good (pleasant): Weather, temperature, humidity, air, machinery,
hours of work, location, physical surroundings of work, etc.
Bad (unpleasant): Weather, temperature, humidity, air, machinery,
hours of work, location, failure to get desired time off; physical
surroundings of work, etc.

11. Uncodable or other
Good: For example, outcome of union election; etc.
Bad: For example, outcome of union election; accident (that does
not belong in #4 above); etc.


based on what appeared to be the most important 
factor. There was initial agreement on 84% of the 
events and 88% of the agents. All except five of the 
disagreements were resolved by discussion,⁴ and the
remaining five protocols were discarded. 


RESULTS 
Good versus Bad Incidents 


Events. The frequencies for the good and
bad event categories for Studies 1 and 2 are 
shown in Tables 2 and 3, respectively. The 
data are shown separately for white- and blue- 
collar employees. The first and second inci- 
dents in Study 1 were combined (i.e., treated, 
in effect, as independent samples). 

The categories were combined in two dif- 
ferent ways in order to parallel the combina- 
tions used by Schneider and Locke (1971). 
First, a motivator/hygiene dichotomy was 
formed on each set of data by combining 
event Categories 1-7 (motivator) and 8-10 
(hygiene). Second, a task/nontask dichotomy 
was formed by combining Categories 1-4 and 
5-10 (see Table 1). The frequencies derived 
from these combinations are shown in the 
columns to the right of each total frequency 
column. 
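
The two ways of collapsing the event categories can be expressed as simple lookups. The sketch below follows the category ranges given in this paragraph; the function names and the list of example codes are hypothetical, and Category 11 is rejected because such protocols were dropped from the analysis.

```python
from collections import Counter
from typing import Callable, Iterable

MOTIVATOR = frozenset(range(1, 8))   # event Categories 1-7
HYGIENE = frozenset(range(8, 11))    # event Categories 8-10
TASK = frozenset(range(1, 5))        # event Categories 1-4
NONTASK = frozenset(range(5, 11))    # event Categories 5-10


def motivator_hygiene(category: int) -> str:
    """Map an event category to the motivator/hygiene dichotomy."""
    if category in MOTIVATOR:
        return "motivator"
    if category in HYGIENE:
        return "hygiene"
    raise ValueError(f"category {category} is not part of the dichotomy")


def task_nontask(category: int) -> str:
    """Map an event category to the task/nontask dichotomy."""
    if category in TASK:
        return "task"
    if category in NONTASK:
        return "nontask"
    raise ValueError(f"category {category} is not part of the dichotomy")


def tally(categories: Iterable[int], classify: Callable[[int], str]) -> Counter:
    """Count incidents per combined category."""
    return Counter(classify(c) for c in categories)


if __name__ == "__main__":
    hypothetical_codes = [1, 4, 4, 7, 8, 9, 5, 2]  # made-up event codes
    print(tally(hypothetical_codes, motivator_hygiene))
    print(tally(hypothetical_codes, task_nontask))
```

Note that Categories 5-7 count as motivators under the first scheme but as nontask events under the second, so the two dichotomies are not interchangeable.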

These dichotomized data were then combined
to form 2 × 2 contingency tables (e.g.,
good/bad vs. motivator/hygiene for white
collar). The results of the chi-square tests
on these tables are shown at the bottom of
Tables 2 and 3. (The Yates correction for
continuity was applied in all cases.) In no
case did any of the tests reach statistical
significance. In other words, there was no
evidence that good (satisfying) incidents were
produced by different classes of events than
bad (dissatisfying) incidents. In all of the
samples the majority of both good and bad
events were produced by motivator factors.
On the average, about half of the white-collar
events were task events, while the figure was
closer to 30% for blue-collar employees.
(White-collar-blue-collar differences are
discussed further below.)

⁴ It is doubtful that this discussion biased the
results since the disagreements in 75% of the
remaining cases were over categories within the motivator
or hygiene dichotomy (as defined by the new
category system). The changes made by the naive
rater from her initial ratings yielded a “net loss” for
the Herzberg theory of 3 incidents (e.g., [...]
from a hygiene to a motivator category for a
dissatisfying incident or from a motivator to a hygiene
category for a satisfying incident).
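
For readers who want to see the mechanics of a 2 × 2 chi-square test with the Yates correction, a minimal sketch follows. It uses the standard shortcut formula for a fourfold table; the cell counts in the usage example are hypothetical, since Tables 2 and 3 are not reproduced here, and the function names are illustrative.

```python
import math


def yates_chi_square(a: int, b: int, c: int, d: int) -> float:
    """Yates-corrected chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a nonzero total")
    # Continuity correction: shrink |ad - bc| by n/2, never below zero.
    corrected = max(0.0, abs(a * d - b * c) - n / 2.0)
    return n * corrected ** 2 / (row1 * row2 * col1 * col2)


def p_value_df1(chi_square: float) -> float:
    """Upper-tail probability of a chi-square variate with 1 df."""
    return math.erfc(math.sqrt(chi_square / 2.0))


if __name__ == "__main__":
    # Hypothetical counts: rows = good/bad, columns = motivator/hygiene.
    good_motivator, good_hygiene = 50, 20
    bad_motivator, bad_hygiene = 45, 25
    chi2 = yates_chi_square(good_motivator, good_hygiene,
                            bad_motivator, bad_hygiene)
    print(f"chi-square = {chi2:.2f}, p = {p_value_df1(chi2):.3f}")
```

The correction makes the test more conservative, which matters most when cell frequencies are small.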

Agents. The Agent data were analyzed in 
terms of a self/nonself (Categories 1/2-8) 
dichotomy. (See display on p. 69.) In all sam- 
ples it was found that employees were signifi- 
cantly more likely to take credit for good 
events and to blame others for bad events 
(ps all < .01, chi-square test; these data are 
not shown). 


White Collar versus Blue Collar 


Events. In order to make more precise com- 
parisons with previous studies possible, the 
frequency data for events were combined ac- 
cording to both a motivator/hygiene dichot- 
omy and in a manner parallel to Friedlander's 
categories. Categories 1-4 were combined to 
form a task (intrinsic, achievement etc.) 
category, Categories 5-8 were combined to 
form a reward (advancement, recognition 
etc.) category, and Categories 9-10 were
combined to form a (social-physical) context 
category. In these analyses, the good and bad 
events were combined, that is, the good and 
bad events were treated, in effect, as inde- 
pendent samples. 
TABLE 4

[...] WHITE-COLLAR AND BLUE-COLLAR EMPLOYEES (GOOD AND BAD [...] COMBINED) FOR STUDIES 1 AND 2

                           Study 1                         Study 2
Event categories   White collar  Blue collar    χ²    White collar  Blue collar    χ²
Task (1-4)              90           32      18.91**       85           46      1.79 (ns)
Reward (5-8)            40           65       9.71*        88           74      1.14 (ns)
Context (9-10)          14           27       5.56*        15           12      <1 (ns)

* p < .05.
** p < [...].

The results for the task/reward/context breakdown are shown in
Table 4. Separately [...] (using expected [...] on the overall
marginals), [...] cantly more self-agents and fewer nonself- [...]
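
The chi-square values in Table 4 can be checked with a short calculation. The sketch below rests on two assumptions that are inferences, not statements from the text: each row is tested on its own as a two-cell comparison against expected frequencies derived from the overall column totals, with a 0.5 continuity correction, and the partly illegible blue-collar context count in Study 2 is taken to be 12 (the value that makes the blue-collar column sum to 132, i.e., 66 employees with two incidents each). Under these assumptions the computed values agree with the tabled ones to within rounding.

```python
from typing import Dict, Tuple

# (white collar, blue collar) frequencies from Table 4.
STUDY_1: Dict[str, Tuple[int, int]] = {
    "Task (1-4)": (90, 32),
    "Reward (5-8)": (40, 65),
    "Context (9-10)": (14, 27),
}
STUDY_2: Dict[str, Tuple[int, int]] = {
    "Task (1-4)": (85, 46),
    "Reward (5-8)": (88, 74),
    "Context (9-10)": (15, 12),  # 12 is inferred from the column total
}


def row_chi_square(white: int, blue: int,
                   white_total: int, blue_total: int) -> float:
    """Two-cell chi-square for one row, with a 0.5 continuity correction."""
    n = white_total + blue_total
    row_total = white + blue
    expected = (row_total * white_total / n, row_total * blue_total / n)
    chi_square = 0.0
    for observed, exp in zip((white, blue), expected):
        deviation = max(0.0, abs(observed - exp) - 0.5)
        chi_square += deviation ** 2 / exp
    return chi_square


def table_chi_squares(study: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Chi-square for each row, using the study's own column totals."""
    white_total = sum(w for w, _ in study.values())
    blue_total = sum(b for _, b in study.values())
    return {name: row_chi_square(w, b, white_total, blue_total)
            for name, (w, b) in study.items()}


if __name__ == "__main__":
    for label, study in (("Study 1", STUDY_1), ("Study 2", STUDY_2)):
        for name, value in table_chi_squares(study).items():
            print(f"{label} {name}: chi-square = {value:.2f}")
```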


these results. The employees in the latter 
study were asked to describe a satisfying or 
dissatisfying day on the job, whereas in this 
study (and in Herzberg et al., 1959) they 
were asked to describe a time when they felt 
especially satisfied or dissatisfied. The former 
question would seem to imply incidents with 
a shorter duration than the latter. Further- 
more, in Schneider and Locke's study the em- 
ployees’ own classifications of the incidents 
were used in the analysis, while in this study 
the classifications of naive raters were used 
(except in the case of agents in Study 1). 
Despite these differences, the same basic re- 
sults were obtained in both studies. 

It should be noted that the existence of 
similar events results for good and bad inci- 
dents for the various combined categories 
(motivator/hygiene, etc.) does not mean that
the patterns were identical for the individual 
categories. For example, it can be seen in 
Tables 2 and 3 that the amount of work and 
smoothness categories were cited more often 
for bad than for good incidents, while the 
opposite was true for achievement. Responsi- 
bility was basically a satisfier, while recogni- 
tion was mentioned frequently as a cause of 
both good and bad days. Physical and inter- 
personal atmosphere were cited more often 
for bad than for good incidents, whereas 
many categories tended to show the opposite 
pattern.

It should be added that Herzberg's moti-
vator/hygiene dichotomy (even with the new
classification scheme used here) may not be
the most useful or revealing way to analyze
the data. The task/reward/context trichot-
omy, which is more in line with Friedlander's
(1965) three-factor approach, may be a more
logical method to use since it groups categories
which have more in common than is the case
with Herzberg's breakdown. Herzberg et al.
(1959) claim that “job satisfiers deal with the
factors involved in doing the job [p. 82],” but
this is somewhat misleading since his moti-
vator factors include both factors in the work
itself and rewards for task performance (rec-
ognition, etc.). With larger samples, category-
by-category comparisons might be preferable
to any combinations of categories.

The results obtained in these studies for
agents also replicated the findings of 
Schneider and Locke (1971). It was found 
that people tend to take credit for good events 
and to blame others for bad events. It should 
be added that Herzberg et al. (1959) found 
the same thing in their original study: “For 
the low [dissatisfying] sequences the atti- 
tudinal effects are referred away from the 
person himself [p. 94].” In view of this 
statement, it is puzzling that Herzberg did 
not see how his use of a classification system 
which mixed events and agents together could 
bias the results. 

The results for agents can be attributed 
both (a) to defensiveness on the part of the 
employees and (b) to the fact that the oppor-
tunities for other persons to cause dissatisfac-
tion are greater than their opportunities to 
cause satisfaction. For example, the individual 
himself must be an agent in task accomplish- 
ment, but task accomplishment can be blocked 
by forces totally beyond his control (see 
Schneider & Locke, 1971). What part of the 
agent results are due to each of these causes 
cannot be determined without a detailed case 
study of each individual. 

The results for the white-collar-blue-collar 
comparisons were inconsistent. The frequency 
results for Study 1 were clearly congruent 
with the findings of previous studies, which 
showed that white-collar employees placed 
more importance on task factors and less im- 
portance on reward and/or context factors 
than blue-collar employees. The trend for the 
task data, and to a lesser extent for the re- 
ward data, was the same in Study 2 but was 
nonsignificant. 

It was also found in Study 1 (but not in 
Study 2) that white-collar employees cited the 
self as an agent more often than blue-collar 
employees. This is congruent with the event 
results, since task factors such as achievement 
are typically more in the individual employ- 
ee's control than are reward or context factors. 

The main difference between the results of 
the two samples is the greater frequency of 
Category 7 (recognition) events for white- 
collar employees in the second study (see Ta- 
bles 2 and 3). This resulted, of course, in a 
larger total frequency for the reward cate-
gory in that sample (see Table 4). This dif-
ference cannot be accounted for by differences 
in the age or sex distributions of the two sam- 
ples, since these were similar in both studies. 
Furthermore, a breakdown by sex in the sec- 
ond sample showed no greater tendency for 
females to cite recognition incidents than 
males. An examination of the occupational 
distributions of the two samples, however, did 
suggest a possible explanation for the differ- 
ences in results. In the Study 2 white-collar 
sample, nearly 60% of all recognition events 
were given by employees in five occupational 
categories: clerical/sales, secretary/reception- 
ist, officer (armed forces), personnel/social 
work, and teacher/principal. It should be 
noted that a key aspect of the jobs in these 
categories involved dealing with other people. 
Almost 25% of the sample consisted of 
people in these categories. In contrast, em- 
ployees in these five categories accounted for 
less than 6% of the total subjects in Study 1. 

This finding has several implications. First,
it suggests that satisfying and dissatisfying
job incidents are not solely a reflection of
“human nature” as such, but that they also
reflect differences in both the actual structure
of jobs and people's experiences in different
jobs. Second, it suggests that if future studies
compare white- and blue-collar employees
within specific occupational groups, more con-
sistent results will be found. In other words,
[...] of occupational dif[...]
white-collar-blue-collar results,
these results are not as “robust” as the good
versus bad event results. Finally, these data 
suggest that the critical incidents method 
might be used to uncover salient sources of 
dissatisfaction within various occupations, or- 
ganizations, or even within departments of a 
single organization.


Relations among the General Aptitude Test Battery (GATB) scales, job-
related behavior (supervisor's ratings), and actual production and efficiency
rates were examined for 76 coil winders in an overhead distribution trans-
former plant. In terms of published strategies applied in selection and place-
ment activities, the use of the GATB was found lacking in several aspects.
The correlations between the GATB scales and the rating, production, and
efficiency variables were found to be from low to insignificant, including more
than one-third that were negative. The application of published selection
strategies with GATB scales was ineffective in differentiating between good and
poor coil winders in terms of ratings, production, and efficiency. Additionally,
the correlations between the rating variables and the production and efficiency
variables are at best only from moderate to insignificant. The results of the
study indicate a need for further evaluation of the GATB in the industrial
setting.


The General Aptitude Test Battery
(GATB) is one of the most widely used test
batteries in the United States. The GATB
was developed and published by the U.S. De-
partment of Labor and the U.S. Training and
Employment Services (undated) and has been
used for a number of years by most, if not
all, state training and employment services
offices. The GATB consists of 12 tests which
are grouped variously into nine scales as fol-
lows: General Intelligence (G), Verbal Apti-
tude (V), Numerical Aptitude (N), Spatial
Aptitude (S), Form Perception (P), Clerical
Perception (Q), Motor Coordination (K), Finger
Dexterity (F), and Manual Dexterity (M);
the first seven of these scales are composed
of paper-and-pencil tests while the last two
represent performance measures.

A person looking for a job may take the
GATB at a local state training and employ-
ment services office, and the office personnel
may then report, not the scores themselves
but rather indications of scoring which take
into account the standard error of measure-
ment on certain scales of the GATB, to assist
companies in personnel selection and place-
ment activities. The scale score information
depends on the job, and the selection of scales
is based on validity studies (e.g., U.S. De-
partment of Labor, 1965b, 1966; U.S. Train-
ing and Employment Services, 1963) in which
an attempt is usually made to identify the job
in the Dictionary of Occupational Titles
(DOT; U.S. Department of Labor, 1965a).
Indeed, various numbers of GATB scales are
published as specific aptitude measures ap-
propriate for jobs together with minimum
[...] standards in accordance with DOT
[...] (U.S. Department of Labor, un-
dated).

The GATB has been used in many voca-
tional-educational studies. Samuelson (1956),
for instance, reported multiple correlations
ranging from .51 to .83 using these GATB
scales to predict instructor ratings in courses
such as auto mechanics and electronics, while
Ingersoll and Peters (1966) found a multiple
correlation of .62 in the prediction of me-
chanical drawing grades for ninth- and tenth-
grade students. Droege's (1965) results, where
the multiple correlation between instructors'
ratings for ninth-grade vocational printing
students and four GATB scales was only .38,
are not so promising. Impellitteri and Kopes
(1969), however, lament the lack of studies
in an industrial setting available in the litera-
ture.


The purpose of the present study is to in- 
vestigate the relationships among the GATB 
scales with ratings by first-line supervisors 
and actual production and efficiency rates for 
experienced workers in a production plant. 
The widespread use of the GATB for em- 
ployment and placement purposes throughout 
the United States, together with the lack of 
industrial studies available in the literature, 
provides a crucial position for the present
study. Moreover, the present study includes
actual job output data, while many studies of
more specialized distribution (e.g., U.S. De-
partment of Labor, [...]; U.S. Training and
Employment Services, 1963) contain [...]
as criterion [...] GATB. The [...]
production and efficiency rates [...]
production setting.


[...] Southeastern United [States ...] employees. The [...]
[...] assignments [...]

The ratings included in the study [...] of first-line
[...] with regard to eight job-related factors
as follows: (1) quantity of production, (2) quality
of production, (3) [...] knowledge, (4) initiative,
(5) ability to learn, (6) attention to duty, (7) de-
pendability, and (8) co[...]. A detailed explana-
tion of each item was given to the supervisors who
rated their coil winders using a 5-point scale for
each item, [...] from low to high and from [...] to five
with intermediate midpoint values [...]
etc.). The total score was a sum of the item [...].
The rating scales have been used many times by
these same supervisors.3

Since the GATB was administered under con- 
trolled conditions, scores on all nine GATB (Form 
B-1002) scales were obtained from the local state 
training and employment services office for the sam- 
ple under study. 

The production and efficiency data were taken 
from daily records maintained by the company and 
reflect the proportion of units actually produced com- 
pared to an expected standard established by indus- 
trial engineering studies. The production data rep- 
resent that proportion over the entire work day, 
while the computation of efficiency takes into con- 
sideration the amount of time the coil winder is 


legitimately out of production (e.g., for machine
malfunction, incorrect winding information, etc.).5


The daily rates were averaged over a 1-month pe- 
riod for each coil winder in the sample, and these 
averages constituted the data used for analysis in
the present study.
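
The distinction between the two rates can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the choice of arguments (a standard output per hour, the length of the work day, and the hours legitimately out of production) are hypothetical and are not taken from the company's records. The numbers in the usage note follow the footnoted example of three coils per hour over an 8-hour day.

```python
def production_rate(units_produced, standard_per_hour, hours_in_day=8.0):
    """Units produced relative to the standard output for the entire work day."""
    return units_produced / (standard_per_hour * hours_in_day)


def efficiency_rate(units_produced, standard_per_hour,
                    hours_in_day=8.0, excused_hours=0.0):
    """Units produced relative to the standard for the time actually available.

    excused_hours is time legitimately out of production (hypothetical name).
    """
    available_hours = hours_in_day - excused_hours
    if available_hours <= 0:
        raise ValueError("excused time must be less than the work day")
    return units_produced / (standard_per_hour * available_hours)


def monthly_average(daily_rates):
    """Mean of the daily rates, as averaged over the 1-month period."""
    rates = list(daily_rates)
    if not rates:
        raise ValueError("no daily rates supplied")
    return sum(rates) / len(rates)
```

With 20 coils produced and no downtime, both functions return 20/24; with 1 hour of excused downtime, `production_rate(20, 3)` is unchanged while `efficiency_rate(20, 3, excused_hours=1)` becomes 20/21. An efficiency rate above 1.0 is possible, for example with 22 coils and 1 excused hour.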

There were missing data in the study. Two persons 
had no ratings, and one had no GATB scores what- 
ever; five lacked GATB F and M scores, and four
others lacked five or six GATB scale scores of all 
types. These missing data were replaced with re- 
gression estimates from all other variables for those 
with complete data. 


3 In a small rate/rerate study with seven super-
visors each rating two of their winders after a one-
week interval, the correlation of the total scores was
.96.

4 Individual GATB scores were obtained through
the courtesy of the local state training and employ-
ment services office solely for analysis in the present
study.

5 For instance, if the standard studies indicate
that three coils of a particular type should be pro-
duced in 1 hour, then the total standard produc-
tion over an 8-hour day would be 24 coils. If a coil
winder works full time on the produc[tion ...] and
produces 20 coils, his production [and efficiency rates]
are both 20/24 = .83. If the machine had broken
down and he had been out of production for 1
hour but still produced 20 coils, however, then the
expected output is only 21 coils (i.e., 7 hours times
three per hour) and his production rate remains .83,
but his efficiency rate is 20/21 or approximately .95
for that same day. For a production of 22 [coils, the]
production rate would be 22/24 = .92, [and the effi-]
ciency rate would be 22/21 = 1.05.


RESULTS


The intercorrelations, means, standard de-
viations, and score ranges for all variables are
presented in Table 1. Of the 28 item correla-
tions in the rating scale, 21 are significant be-
yond the .01 level and are positive from low
to moderate, indicating good internal consis-
tency but suggesting independence of item in-
formation. While part-whole correlations are
sometimes difficult to interpret (e.g., Bashaw
& Anderson, 1967), all of the item-total cor-
[relations ...]

[...] sets of paper-and-pencil [...] again, there is
some part-whole type of problem because G
is made up of V, part of N, and part of S. The
[...] and standard deviations com-
pare favorably with those in other published
[...] Training and Employment
[...] except that the M mean
[...]. Further, even though the
[...] suggests that [...]
about [...]% of its production capacity
[...] standard deviations [...]
a reasonable amount of [...]
in these variables [...]
[pre]diction purposes. Fur[ther, ...]
that [...]
day for production but [...] lesser time for
excusable time out of production for the com-
putation of efficiency).

In Table 1 there appear to be relatively low
correlations between the GATB scales [and] the

D Is Rousu, 


AND J. E. McCrary 


other variables used in [...]
GATB [...]. [As indi-]
cated in previous discussion, the lack of corre-
lation does not appear to be due to inadequate
individual variation, truncation in the score
[...] to curvature in the
bivariate data. (In regard to the latter, a cur-
sory examination of all 190 plots did not re-
veal any marked curvature at all.) It appears
simply that GATB scores are not predictive of
either job-related behavior in the plant or
actual rates of production and efficiency in
coil winding.

The above results stand in sharp contrast
to the results reported by the U.S. Training
and Employment Services (1963) in another
coil winding study where six rating total and
GATB scale correlations are significant at the
.01 level, and the remaining three [are signif-]
icant at the .05 level. [...]
four of the scales with P = 80, Q = 90, F =
85, and M = [...]5 and support this strategy by
reanalyzing their data, dichotomizing for qual-
ifiers/nonqualifiers on the GATB and for good
and poor workers [...]
thereby obtaining a [...]
many instances (e.g., Quereshi, 1971, p. 226).
Nevertheless, adopting the U.S. Training and 
Employment Services strategy we find that 
24 (31%) of the 76 coil winders in our study 
would have failed to qualify in one or more 
of the four scales, approximately the same 
percentage (namely, 34%) as that reported in 
the U.S. Training and Employment Services 
(1963) study. The mean total rating score for
the nonqualifiers is 27.02, while that for
the qualifiers is 27.46, and a t test on the
difference results in t = .60, p > .05. The
average production rate for the nonqualifiers
is .8538, while that for the qualifiers is .8585,
and the test for difference is t = .29, p > .05;
a similar test of significance for the efficiency
averages of .9097 and .9116 for nonqualifiers
and qualifiers, respectively, results in t = .15,
p > .05. The corresponding point-biserial cor-
relations (e.g., Joe & Anderson, 1969) are too 
small to be meaningful. Using the U.S. Train- 
ing and Employment Services suggested strat- 
egy, then, no differentiation seems evident be- 
tween qualifying and nonqualifying selected 
coil winders on the basis of job-related behav- 
ior or production and efficiency rates. 
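
The size of the point-biserial correlations implied by these t values can be checked with the standard identity r = t / sqrt(t² + df). The sketch below is illustrative only. It assumes an independent-samples, pooled-variance t test with df = 76 - 2 = 74 (24 nonqualifiers and 52 qualifiers); the degrees of freedom are not stated in the text, and the function name is hypothetical.

```python
import math


def point_biserial_from_t(t, df):
    """Point-biserial r implied by a pooled-variance two-group t statistic."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return t / math.sqrt(t * t + df)


# Reported t values for ratings, production, and efficiency; df = 74 is assumed.
for label, t in (("ratings", 0.60), ("production", 0.29), ("efficiency", 0.15)):
    print(label, round(point_biserial_from_t(t, 74), 3))
```

Under these assumptions the three correlations come out at roughly .07, .03, and .02, which is in line with the statement that they are too small to be meaningful.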

From the multiple regression approach (e.g.,
Anderson & Fruchter, 1960) using all nine
GATB scales, the multiple correlation predict-
ing production is .32 (F = .84, df = 9/66,
p > .05), while that for predicting efficiency
is .39 (F = 1.31, df = 9/66, p > .05). As
was noted previously, only one of the 18 zero-
order correlations between the GATB scales
and the production and efficiency variables is
barely significant at the .05 level (namely,
the .23 for Q and efficiency), and it appears
that not even suppressor-type effects (e.g.,
Anderson & Fruchter, 1970) are to be gained
by inserting all nine scales in the prediction


process.
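
The F ratios reported for the two multiple correlations follow from the usual test of R against zero, F = (R²/k) / ((1 - R²)/(N - k - 1)). The short sketch below recomputes them, assuming N = 76 winders and k = 9 predictors as given in the text; the function name is a hypothetical one chosen for illustration.

```python
def f_from_multiple_r(r, n, k):
    """F ratio and degrees of freedom for testing a multiple correlation R."""
    df1 = k
    df2 = n - k - 1
    if df2 <= 0:
        raise ValueError("sample too small for the number of predictors")
    r2 = r * r
    f = (r2 / df1) / ((1.0 - r2) / df2)
    return f, df1, df2


for label, r in (("production", 0.32), ("efficiency", 0.39)):
    f, df1, df2 = f_from_multiple_r(r, n=76, k=9)
    print(f"{label}: F({df1}, {df2}) = {f:.2f}")
```

With these inputs the degrees of freedom are 9 and 66, and the F ratios are about .84 and 1.3. Small differences from the printed values are to be expected because R is given to only two decimals.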


DISCUSSION


The results herein are not consistent with
those published by the U.S. Training and
Employment Services (1963) for the job of
coil winding. There may be, however, several
[reasons for] the discrepancies.

The samples in the two studies differ in
several respects. The U.S. Training and Em-
ployment Services sample consisted of fe-
males working in three different plants, while
the sample for the present study was 76 coil 
winders (approximately 20% female) all 
working in the same plant. No sex differences 
in winding standards are maintained in the 
plant in the study herein, and the job de- 
scription given by the U.S. Training and 
Employment Services (1963, p. 3) would seem 
to cover, to a large extent, the winding job 
involved in the present study. 

Another difference in the two studies resides 
in the criterion variable. The U.S. Training 
and Employment Services study, as well as 
others (e.g., U.S. Department of Labor,
1965b, 1966), contains supervisory ratings as 
the criterion variable, while the central crite- 
ria herein are the actual production and effi- 
ciency rates of the winders, although ratings 
were obtained from supervisors. Certainly, 
job-related behavior is important, but the 
low-to-moderate correlations between the rat- 
ings and the production and efficiency rates 
would seem to engender suspicion of the ad- 
visability of using ratings as the sole criterion. 

The results of the present study indicate 
that further verification is needed regarding 
the utility of the GATB as a selection device 
in the industrial setting, particularly in view 
of the very wide use of the scales and stan- 
dards in the state training and employment 
services offices throughout the United States. 
There is a need for further industrial studies 
that include not only rating criteria but other 
job performance criteria as well. Further, in 
view of the wide application of standards, we 
would strongly recommend samples much 
larger than the ones of from 50 to 70. Al- 
though the sample of 76 used in the present 
study is only about 15% larger than that for 
the U.S. Training and Employment Services 
(1963) standardization study, it should be 
viewed as laying some foundation for further 
research.


SELF-ESTEEM AS A MODERATOR OF THE RELATIONSHIP
BETWEEN EXPECTANCIES AND JOB PERFORMANCE 


JAMES F. GAVIN 1


Colorado State University 


This study examines the implication of Korman's consistency hypothesis for
predictions of work behavior, derived from Porter and Lawler's expectancy
model, and evaluates the feasibility of moderating the expectancy-performance
relationship with relevant variables. Three hundred and sixty-seven male and
female managerial-level employees were subgrouped on the basis of self-esteem
scores, and correlations between expectancies and job performance were com-
puted for each subgroup. In 16 out of 22 comparisons, the correlations for the
high-self-esteem groups were higher. However, only five of the differences
were significant, thereby providing equivocal support for the consistency hy-
pothesis. Overall, the study indicates that moderator variables may be relevant
to predictions of performance with expectancy measures.


The “path-goal” model of work behavior,
as expressed originally by Georgopoulos, Ma-
honey, and Jones (1957), later on by Vroom
(1964), and more recently by Porter and
Lawler (1968) and Graen (1969), presents
man as goal oriented, need satisfying, and as
one who interacts with his environment in a
“rational” manner (Wanous, 1972). Such a
model suggests that incentives and rewards
will have similar influences on behavior for
all workers, provided these incentives have the
same values and subjective probabilities for
the workers. However, the findings of Blumen-
feld (1965) and Schletzer (1966) suggest that
this model may not be a very accurate one.
Blumenfeld, for example, hypothesized that
for individuals who had recently reenlisted in
the Navy, there would be a high correlation
between their own values and the values that
they expected to be supported by the Navy.
His correlations, however, were negative in
direction. Schletzer, on the other hand, found
no relationship between scores on the Strong
Vocational Interest Blank scales and satisfac-
tion with one’s occupation for six separate
occupational groups. Such evidence would
not appear to support the notion of man as
goal oriented and need satisfying.

1 Requests for reprints should be sent to James F.
Gavin, Department of Psychology, Colorado State
[...]

An alternate view of man has been offered
by Korman (1966, 1967, 1970) in a series of
recent papers. Korman (1970) suggests that
man is “consistent,” not self-enhancing, and
that “. . . in the industrial situation incentives
as motivators of performance are circum-
scribed in their effects by their perceived ap-
propriateness for the individual [p. 85].”
Korman's (1970) basic hypothesis is that


Individuals will be motivated to perform on a task 
or job in a manner which is consistent with the self- 
image with which they approach the task or job 
situation. That is, to the extent that their self-concept 
concerning the job or task situation requires effec- 
tive performance in order to result in “consistent” 
cognitions, then, to that extent, they will be moti- 
vated to engage in effective performance [p. 32]. 


In this light, it can be argued that a rational
self-interest model should hold only for those 
individuals who have high self-esteem, where 
self-esteem is defined as one's general evalua- 
tion of oneself as a need-satisfying, adequate 
individual. Evidence supporting a “consis-
tency” hypothesis has been summarized by
Korman (1971, p. 46).

The purpose of this study was to examine 
the implications of Korman’s hypothesis as 
they relate to predictions of work behavior 
derived from Porter and Lawler's (1968) ex- 
pectancy model. It was hypothesized that the
relationship between reward expectancies and 
job performance would be supported for high- 
self-esteem individuals only. However, in a 
more general sense, this study attempted to 
ascertain the value of moderating the relation- 
ship between expectancies and performance 
with relevant variables. 


METHOD

Subjects


Participants were 192 male and 175 female em-
ployees, broadly classified as “managerial candidates,”
who were on the home office staff of a large (N =
approximately 18,000) northeastern insurance com-
pany. Subjects were selected on a representative
sampling basis from the management trainee pro-
gram. Only 33 of the 400 selected were unable or, in
a few cases, unwilling to take part. The work ac-
tivities of the participants included systems and
programming, actuarial and underwriting, mortgages
and investments, and personnel. At the time of em-
ployment, all of the subjects had a minimum edu-
cational level of college graduation. However, more
males than females in this sample had degrees be-
yond the baccalaureate. The average age of the
males was 29 and of the females, 24. Accordingly,
males tended to have greater tenure on the job than
females, with an average of 54.9 months of service
for males versus 22.5 months for females.


Measures


As part of a larger study (Gavin, 1969) on the
Porter-Lawler model, measures of expectancy, self-
esteem, and job performance were collected.

Expectancies. To operationalize this construct, the
type of measure used by Lawler and Porter (1967)
was chosen. The design of this measure is as follows:

Part 1. Values: Subjects are presented with a
9-point scale (ranging from extremely undesirable to
extremely desirable) on which to base their judg-
ments about the value of certain rewards. There are
21 rewards listed, including such things as high
pay, giving help to others, feelings of security, and
promotion.

Part 2. Expectancies: Subjects are presented with
[...] scale, this time with 7 scale points (ranging
[...] always) on which to base their judg-
ments regarding the frequency with which the job
behaviors of working hard and good job performance
lead to certain rewards. The rewards are the same
ones listed in the values [...]
[...] most rewards, [...] (a) How
frequently does [...] hard lead to this reward?
[...] effort- [...] a given [...]

From the expectancy and values ques-
tionnaire, two scores were derived for each [subject]:
(a) unweighted expectancy, the sum of the ex-
pectancies (i.e., all items in Part 2), and (b) weighted
expectancy, the sum of the expectancies [multiplied]
by their respective values (from Part 1).

The reliability coefficients (Kuder-Richardson
Formula 20) for these measures were, respec-
tively, .88 and .91 for males and .89 and .89 for
females. Additional information on the measures has
been provided by Gavin (1969).
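
The two expectancy scores described above can be sketched as follows. This is a minimal illustration, assuming one expectancy rating and one value rating per reward, held in parallel lists; the function name and the example ratings are hypothetical and do not come from the questionnaire itself.

```python
def expectancy_scores(expectancies, values):
    """Return (unweighted, weighted) expectancy scores for one subject.

    expectancies: Part 2 ratings, one per reward.
    values: Part 1 desirability ratings for the same rewards, same order.
    """
    if len(expectancies) != len(values):
        raise ValueError("need one value rating per expectancy rating")
    unweighted = sum(expectancies)
    weighted = sum(e * v for e, v in zip(expectancies, values))
    return unweighted, weighted


# Hypothetical subject with three rewards only (the instrument lists 21).
print(expectancy_scores([7, 4, 1], [9, 5, 1]))
```

The weighted score gives more influence to rewards that the subject rates as highly desirable, which is the only difference between the two measures.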

Self-esteem. As in Korman's (1966, 1967, 1970) 
works, the measure of self-esteem used in this re- 
search was Ghiselli’s (undated) Self-Assurance Scale.
This instrument contains 31 forced-choice adjective 
pairs and, according to its author, measures “. . . 
the extent to which the individual perceives himself 
as being effective in dealing with the problems that 
confront him [p. 16].” Additional data on the Self-
Assurance Scale are provided in Ghiselli (1971).

First performance rating. A modified Likert-type 
(cf. Edwards, 1957) rating procedure was devised 
for this study. The rating form contained nine items 
on which the supervisor was to evaluate such fac- 
tors as quality of work, quantity of work, and over- 
all job performance. The nine items, which were 
measured on 9-point numerical scales, were then 
summed to obtain a total performance score. Kuder-
Richardson Formula 20 reliabilities for this measure 
reached acceptable levels for both males (.85) and 
females (.95). 

The first set of ratings was obtained within 2 
weeks after the administration of the research in- 
struments. 

Second performance rating. One year after the in- 
struments had been administered, an overall per- 
formance rating on a 4-point scale was obtained 
from company records. This second evaluation, 
which was made by the immediate supervisors, re- 
sulted from the company’s annual appraisal pro- 
gram, whereas the first set of ratings had been 
developed for research purposes only. The correla- 
tion of this rating with the research rating was .55. 


Procedure 


Research instruments were administered during 
company time, in large testing rooms, by a repre- 
sentative of an independent trade agency (Life 
Office Management Association). At the end of each 
session (approximately 1½ hours), subjects sealed
their instruments in envelopes and handed them to 
the researcher. The first set of performance ratings 
was kept confidential by having supervisors mail 
them directly to the researcher. The second set of 
ratings was collected from company records. 

Subjects were divided into two groups on the basis 
of scores on the self-esteem measure. The distribu- 
tion of scores was found to be normal for males and 
females, so that the average was used as the dividing 
point. Scores above 27 for males and above 24 for 


females were considered to be indicative of high self- 
esteem.


RESULTS 


The data were analyzed for the total sam- 
ple, as well as by sex, job level, educational 
level, and department. A comparison of means 
and standard deviations between the high-
and low-self-esteem samples yielded no sig-
nificant differences on either the expectancy 
measures or the ratings. However, there was a 
slight but nonsignificant tendency for males to 
have higher performance ratings and expec- 
tancy scores. Also, significant correlations 
were found between self-esteem and both job 
level (r = .19, p < .05) and educational level 
(r = .14, p < .02). These correlations, how-
ever, were reduced to nonsignificant values 
when sex was controlled for. 

Males and females were classified as high 
or low self-esteem, depending upon their re- 
spective cutoff scores for the self-esteem mea- 
sure. It should also be mentioned that due to 
the similarity of correlational values for the 
unweighted and weighted expectancy mea- 
sures, only the former are reported. 

As can be seen in Table 1, the predicted 
moderating effect of self-esteem was supported 
for the total sample when the second perfor- 
mance criterion is considered. The p level in 
this and subsequent tables refers to the prob- 
ability that the two correlations (for high 
self-esteem and low self-esteem) were drawn 
from populations with the same true value of 
rho (cf. Ferguson, 1966). When the sample 
was subdivided by sex (see Table 2), none 
of the differences between high- and low-self- 
esteem groups reached significance, although 
all four were in the predicted direction. Sub- 
dividing the sample on the basis of educa- 
tional level (see Table 3) resulted in signifi- 
cant (p = .02) or near significant (p = .07)
values for the college graduate groups, but 
essentially no differences for the groups with 
degrees beyond the baccalaureate. 
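
The comparison of two independent correlations described above is conventionally carried out with Fisher's r-to-z transformation. The sketch below shows that standard test; it is an assumption that this is the procedure taken from Ferguson (1966), and the text does not say whether one- or two-tailed probabilities were reported. The function name and the example correlations (.30 and .10) are hypothetical; only the group sizes (183 and 184) come from Table 1.

```python
import math


def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher z test that two independent correlations share one population rho.

    Returns the z statistic and the two-tailed probability.
    """
    if n1 <= 3 or n2 <= 3:
        raise ValueError("each sample needs more than 3 cases")
    z1 = math.atanh(r1)  # Fisher transformation of each correlation
    z2 = math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_tailed


z, p = compare_independent_correlations(0.30, 183, 0.10, 184)
print(round(z, 2), round(p, 3))
```

With groups of this size, a difference of about .20 between the two correlations is close to the conventional .05 boundary, which illustrates why subgroup comparisons based on far fewer cases rarely reached significance.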


The next set of comparisons involved sub-
grouping the sample on the basis of job level
(see Table 4). The lower level group consisted
of subjects in high-level clerical functions or
trainee positions. For the most part, those in
the middle-level group were in first-line super-
visory or technical staff positions. The highest
level group was composed of middle-level
managers and some professionals (e.g., senior
systems analysts, security analysts). The cor-
relations for these groups indicate an inter-
action between self-esteem and job level, in
that the prediction was supported most
strongly for the upper-level and the lower-
level groups, and the correlations were in the
predicted direction. The middle-level groups,
however, evidenced nonsignificant correlations
in a direction opposite to that predicted.

Finally, the sample was subdivided on the
basis of departmental affiliation (Table 5)3
where the department was large enough to
permit such analysis. The results for two of
the three departments were in the predicted
direction, although to a nonsignificant degree
in all cases.


TABLE 1

CORRELATIONS BETWEEN EXPECTANCIES AND PERFORM-
ANCE CRITERIA FOR TOTAL HIGH (HSE) AND
LOW (LSE) SELF-ESTEEM SAMPLES

                            Total sample
Expectancy rating    HSE (n = 183)   LSE (n = 184)   p level
First                    [...]           [...]         [...]
Second                   [...]           .10           .05

* p < .01.



TABLE 2

CORRELATIONS BETWEEN EXPECTANCIES AND PERFORMANCE CRITERIA FOR HIGH (HSE) AND LOW (LSE)
SELF-ESTEEM SAMPLES SUBGROUPED BY SEX


Females Males 
Expectancy . 

ee HSE LSE HSE E | 

one - : SE LSE 

ra "ED (n — 87) gaa (n = 95) wagy | led 
- 3a -25* 33 3n 18 20 
p pm | 48 18 Gm | m n 
um o5. Í 


wep c.l. 
TABLE 3

CORRELATIONS BETWEEN EXPECTANCIES AND PERFORMANCE CRITERIA FOR HIGH (HSE) AND
LOW (LSE) SELF-ESTEEM SAMPLES SUBGROUPED BY EDUCATIONAL LEVEL


| College degree | 


Advanced degree 


Expectancy - —]-— — — 
rating HSE LSE | lens HSE |  LSE | Bia 
(n = 120) | (n = 135) | 2 level (0258) | (n = 49) | Pieve 
First "m | o "ru .19* EN" 
Second Gt .20* .02 .00 .00 — 
* p < .05.
** p < .01.
Discussion


The findings of this [...] clear support for Korman's consistency
hypothesis. Most of [...] self-esteem and subjects' job
and educational levels. These suggestions of [...], however,
seem to lend some [...] general hypothesis of this [...]
utility of moderating the [...] job expectancies and performance.

[...]nce concerning its construct validity. Other measures
(e.g., Rosenberg Self-Esteem Scale, Tennessee Department
of Mental Health Self-Concept Scale, California Psychological
Inventory) might have been more appropriate
than the Ghiselli Self-Assurance Scale. Another
methodological problem has been pointed out by Korman
(personal communication, July, 1971). He indicates that when
expectancy items are phrased "What is the
expectancy of achieving X, if you, yourself,
work hard?" a moderating effect of self-esteem
would not be expected, since this
question is conceptually similar to a high
expectancy of success (i.e., high-self-esteem)
position. On the other hand, if items are
phrased in a more general manner (e.g.,
"What is the expectancy of achieving X by
working hard?"), a moderating effect would
be expected. Since the instrument involved in
this study instructed subjects to indicate
"How often is it true for you personally that
iu the first factor (hard work or good job perfor- 
TABLE 4

CORRELATIONS BETWEEN EXPECTANCIES AND PERFORMANCE CRITERIA FOR HIGH (HSE) AND
LOW (LSE) SELF-ESTEEM SAMPLES SUBGROUPED BY JOB LEVEL

Job level
Expectancy Lower " - 
rating Middle Upper 
HSE LSE MARRS Po e E e D aaen 
(n = 66) (n= 82) P level HSE | LSE HSE LSE 
(n = 37 = 39) | Ż level pisaga à p level 
—————À a = 
m "B&B Pagosa uM Q ) i (n = 43) | (x = 18) 
Second .28* ‘05 ee | 35* E AG 35* —.30 .02 
ee e U xu | 30 28 3T —.54* .001 
TABLE 5

CORRELATIONS BETWEEN EXPECTANCIES AND PERFORMANCE CRITERIA FOR HIGH (HSE) AND
LOW (LSE) SELF-ESTEEM SAMPLES SUBGROUPED BY DEPARTMENTAL ASSIGNMENT


Department

Expectancy | Actuarial                     | Systems/programming                 | Insurance
rating     | HSE      | LSE      | p level | HSE         | LSE         | p level | HSE         | LSE      | p level
           | (n = 37) | (n = 32) |         | (n = 51)    | (n = 52)    |         | (n = 61)    | (n = 63) |
First      | .26      | .31      | .83     | .35         | [illegible] | .40     | [illegible] | .09      | .48
Second     | .14      | .21      | .78     | [illegible] | .22         | .60     | .39**       | .00      | .05
* p < .05.
** p < .01.


mance) leads to the second (some job re- 
ward) on your job?," the moderating effects 
of self-esteem should have been mitigated. 
Criterion problems may also have played a
part in the failure to find clear support for the
hypothesis, although this explanation appears
to be weakened by the finding that the first
and second ratings, which involved different
instruments and response sets, still correlated
highly with one another (r = .55, p < .01).
From a theoretical point of view, the con- 
sistency hypothesis appears to receive indirect 
support from studies of achievement motiva- 
tion, internal-external control of reinforce- 
ment, and risk taking. For example, Mahone 
(1960), in examining the vocational pref- 
erences of high school students as a function 
of achievement and fear-of-failure motiva- 
tion, found that subjects high in failure-avoid- 
ant motivation tended to choose occupations 
that were too difficult or too easy in terms of 
their ability. On the other hand, high need
achievers were more likely to choose occupations
appropriate to their ability. Two other
studies by Burnstein (1963) and Morris
(1966) corroborate this pattern of “irrational”
vocational choices among high fear-of-failure
subjects.

In an application of Rotter's (1966) notion
of internal-external control of reinforcement
to racial differences in level of aspiration
and risk-taking behavior, Lefcourt and
Ladwig (1965) found that blacks, whose
scores indicated more of a belief in external
control, demonstrated what was interpreted
as an “irrational” pattern of risk-taking
behavior. Blacks, for example, showed significantly
more increases in aspiration level following
failure and more decreases in aspiration
level following success than their white
counterparts. This pattern of behavior was
viewed as consistent with a belief system
wherein success and failure are not under
one’s own control.

Further, in their studies of decision making 
and risk taking, Kogan and Wallach (1967) 
found that certain types of individuals, as 
identified by personality measures, tended to 
use “adaptive” or “rational” strategies in risk- 
taking situations. For example, in Kogan and 
Wallach’s “final bet procedure,” an adaptive 
strategy for someone with minimal prior win- 
nings would be a risky bet, whereas for some- 
one with considerable prior winnings a con- 
servative final bet would be appropriate. 
Those with high motivational disturbance,
that is, high test anxiety and defensiveness,
were found to bet in an unadaptive manner,
disregarding the magnitude of their prior
winnings. Subjects low in motivational disturbance,
on the other hand, were found to bet in
a more adaptive or rational fashion. Kogan
and Wallach (1967) comment that


It looks ... as if we have been able to pinpoint
those individuals for whom the formulations of the
mathematical model builders can be expected to
find their clearest application [p. 211].


Another type of explanation related to mod- 
erating influences on expectancy-performance 
relationships is suggested by Locke (1968). In 
attempting to explain why subjects who are
given hard goals may not work for their
attainment, Locke indicates that the issue
concerns the difference between task assignment
and goal acceptance. He states that when a
hard task is accepted, “the only logical thing 
to do is to try one’s hardest until one decides 
to lower or abandon the goal [p. 168].” How- 
ever, people who stop trying when they are 
confronted by a hard task have not accepted 
the task assignment or have decided the goal 
is impossible to reach. In a job situation, it is 
possible that those subjects for whom the 
expectancy predictions do not hold may agree 
that performance is related to goal attainment, 
but may not have accepted the task or feel 
that the task is not possible for them. 

So, to sum up, it would appear that some 
strands of research point toward the utility of 
moderating influences on the expectancy-per- 
formance relationship and that a few of these 
research efforts tend to highlight the rational— 
irrational dichotomy which Korman discusses. 
The more important issue, however, is that 
given an employee who perceives rewards to 
be contingent upon performance, the goal- 
oriented behaviors of this employee are likely 
to be determined, in part, by certain person- 
ality factors. Obviously, a more extensive 
study is called for, both as a replication of 
this investigation and an extension of it, so as 
to include such other potential moderators as 


internal-external control, test anxiety, and
achievement motivation.


The purpose of this investigation was to determine if Schutz’s Fundamental
Interpersonal Relations Orientation (FIRO) compatibility theory, based on
interpersonal need satisfaction, would hold in a context which emphasized
rational, nonpersonal processes. Subjects for the study were industrial
supervisory staff unit members. Two types of interpersonal compatibility were
compared to two measures of interpersonal work effectiveness and to a
measure of sociometric choice. Only 2 out of 20 operational hypotheses testing
the theoretical postulate of compatibility were supported by the data. It was
concluded that the theory did not hold in the context studied.


The FIRO theory developed by Schutz
(1958, 1966) is a theory of interpersonal
behavior built around the concept of
interpersonal need. The theory contends that
interpersonal behavior is behavior aimed at
satisfying those internal needs that can only be
satisfied through human interaction. Three
such needs (inclusion, control, and affection)
are held to be sufficient to explain and
predict interpersonal phenomena. The configuration
of these needs within a person is said to
[...] an orientation toward human rela[...]
acronym, FIRO, for Fundamental
Interpersonal Relations Orientation.

One of the major theoretical postulates
concerns the interpersonal construct, compatibility,
which is conceived as the goodness of
fit between the need configurations of two or
more individuals. It is postulated that the
better the fit, the more likely it is that the
individuals will attain the goal of their
relationship.

A number of invest[...] predicted [...] peer
[...]ibility and a variety [...]
performance (Eisenthal, 1961; [...]
1958), student achievement (Hutcherson, 1963),
learning climate (Powers, 1965), and [...]
for friend (Sapolsky, 1964), as well
as for [...] and companion (Estadt, 1964;
Schutz, 1958).


In the educational setting compatibility 
has been related to teacher attitudes (Obra- 
dovic, 1962). Vodacek (1961), however, con- 
cluded that the theory did not hold in the 
relationship between teacher compatibility 
and either teacher satisfaction or consensus 
regarding the role of principal. 

Sapolsky (1965) and Gross (1959) found 
compatibility to be positively related to 
therapeutic success in group psychotherapy, 
and Yalom and Rand (1966) found it related 
to therapy-group cohesion. In another study, 
Sapolsky (1960) showed that experimenters 
were better able to verbally condition sub- 
jects when the experimenter-subject compati- 
bility was high. In the love relationship 
Kerckhoff and Davis (1962) found greater 
courtship progress among more compatible 
couples. 

FIRO compatibility has failed to be posi- 
tively related to the expression of personal 
values in small groups (Eisenthal, 1961). In 
the school counseling relationship, Arndt 
(1968) failed to support hypotheses linking 
counselor—counselee compatibility with posi- 
tive counseling outcomes. Others, such as 
Hartsough (1964), who found compatibility 
unrelated to communication effectiveness in 
dyad problem solving, ascribed their failure 
to support the compatibility theory to meth- 
odological problems rather than to theory 
limitations. 1 

The above indicates that the FIRO compatibility
theory can explain a variety of in[...]
which is relev[...] variables, which [...]
express direction [...] the overall inten[...]


Instruments


Need direction, magnitudes, and levels are measured
by instruments developed by Schutz: FIRO-B
(1957a) for the level of behavior and FIRO-F
(1957b) for the level of feeling. Each instrument
contains six Guttman scales of nine items each. Both
instruments have acceptable reproducibility levels of
.90 or more. Test-retest reliabilities average .77 for
FIRO-B and .72 for FIRO-F (Hutcherson, 1963, p.
34; Schutz, 1958, p. 78). Validity for FIRO-B has
been well established (e.g., Kramer, 1967; Schutz &
Allen, 1966). The FIRO-F test is a more recent
development and has less evidence of validity. Concurrent
validity evidence is presented by the test publisher
(Schutz, 1967) and by Powers (1965). Evidence
for its predictive validity is offered by Hutcherson
(1963). An investigation w[...]
study of the stability of compatibilities, [...]
compatibilities identified above were computed on
260 dyads across a 4-week test-retest interval.
Stability at the level of compatibility was expected to
be low since it incorporates the variations in each of
12 scales (six scales combined for each of the two
members of [...]). Pearson correlations were
(a) originator compatibility–behavior level = [illegible],
(b) interchange compatibility–behavior level = .60,
(c) originator compatibility–feeling level = .44, and
(d) interchange compatibility–feeling level = .61.
Results will require qualification due to such low
reliabilities.


Design


Dependent variables were developed from the
objective of testing the compatibility postulate in
the work context by adhering as closely as possible
to reality conditions. The FIRO variable, productivity
goal achievement, was translated to interpersonal
work effectiveness, which was measured in
two ways. The first assessed interpersonal work in its
natural field context. Members of supervisory staff
units rated their unit's subject dyads on a 9-point
scale measuring the effectiveness of the dyad's
interpersonal work. Each dyad was rated by five unit
members whose ratings were summed to constitute a
dyad effectiveness score. This variable is termed work
relationship effectiveness. The distribution of these
scores was cut at the median to form a high and
low dichotomy. Compatibility ranks of the high
category were compared to compatibility [...]


The second method of assessing interpersonal
work effectiveness used dyads having natural work
relationships, but exposed them to an experimental
condition consisting of a simulated management task.
The task required the dyads to establish priorities
for several work items and to make appropriate
delegation decisions for these items. Priority and
delegation decisions were scored against the judgments
of a panel of experts. Dyad scores were divided
by the time they required to complete the
task, yielding a productivity measure. This variable
is termed task performance. The task performance
score distribution was compared to the compatibility
distributions by the Spearman rank correlation,
corrected for tied ranks (Siegel, 1956, pp. 206-210).
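
The rank correlation used for task performance can be illustrated with a short sketch. It assigns tied observations the mean of the ranks they occupy (midranks) and then takes the product-moment correlation of the two sets of ranks, which is one standard way of handling ties. The function names and the five pairs of scores are hypothetical; they are not data from the study, and the source's exact computational formula (Siegel, 1956) is not reproduced here.

```python
import math


def midranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0  # mean of ranks i+1 through j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation computed as Pearson's r on midranks."""
    rx, ry = midranks(x), midranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical task performance and compatibility scores for five dyads.
task_performance = [1, 2, 2, 4, 5]
compatibility = [2, 1, 3, 5, 4]
print(round(spearman_rho(task_performance, compatibility), 4))
```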

The FIRO  variable—continued personal inter- 
change—was translated to choice for work partner. 
Subjects within supervisory staff units were asked to 
choose, from among the unit's members, the persons 
they would most and least prefer as a co-worker on 
a critically important business task. Members choos- 
ing each other (mutual choices) for most and mem- 
bers choosing each other for least constituted the 
subject dyads. This variable is termed co-worker 
choice. Compatibility ranks of dyads in the most
category were compared to compatibility ranks of 
dyads in the least category by the Mann-Whitney U 
test, corrected for tied ranks. 
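
The comparison of compatibility ranks between two categories of dyads can be sketched as follows. This is a minimal illustration of the Mann-Whitney U statistic in which a tie between the two groups counts one half toward each group; the further tie correction applied to the variance of the normal approximation is not shown. The function name and the scores are hypothetical, not data from the study.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs in which the group_a score exceeds the group_b
    score, with tied pairs contributing one half.
    """
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


# Hypothetical compatibility scores for "most" and "least" preferred dyads.
most = [12, 15, 15, 20]
least = [8, 10, 15, 11]
u_most, u_least = mann_whitney_u(most, least)
print(u_most, u_least, min(u_most, u_least))
```

The smaller of the two U values is the one conventionally compared with the tabled critical value, which is why the results report a required U "or less."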

The theoretical hypotheses, derived from the FIRO 
compatibility postulate and two of its theorems 
(Schutz, 1958, p. 198), are as follows: 


1. More compatible managerial dyads will have 
greater work relationship effectiveness than will less 
compatible managerial dyads. 

2. More compatible managerial dyads will attain 
higher task performance than will less compatible 
managerial dyads. 

3. The members of more compatible managerial
dyads are more likely to prefer each other as work
partners than are the members of less compatible
dyads.

Each of the four compatibilities, serving as an
independent variable, was compared to each of the
three dependent variables: work relationship
effectiveness, task performance, and co-worker choice.
For tests of work relationship effectiveness and task
performance, two separate relationship conditions,
authority and peer, were studied. Authority relations
consisted of boss-subordinate combinations, and
peer relations consisted of subordinate-subordinate
combinations.

All 20 operational hypotheses resulting from
combinations of the independent and dependent variables
were predicted directionally consistent with the
theoretical hypotheses. Nonparametric statistical methods
were chosen as an accommodation to the ordinal
Guttman scales of the FIRO instruments. The
significance level of .05 was established for all tests,
and one-tailed tests were used throughout.
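
The count of 20 operational hypotheses follows from crossing the four compatibilities with the dependent variables and their relationship conditions. The sketch below enumerates the combinations as they are described in the text; the variable names and the "mutual choice" label for the single co-worker choice condition are assumptions introduced for illustration.

```python
from itertools import product

compatibility_types = ["originator", "interchange"]
need_levels = ["behavior", "feeling"]

# Dependent variables and the relationship conditions under which each was tested.
# "mutual choice" is a placeholder label for the single co-worker choice condition.
dependent_variables = {
    "work relationship effectiveness": ["authority", "peer"],
    "task performance": ["authority", "peer"],
    "co-worker choice": ["mutual choice"],
}


def operational_hypotheses():
    """List every (compatibility type, level, dependent variable, condition)."""
    hypotheses = []
    for ctype, level in product(compatibility_types, need_levels):
        for variable, conditions in dependent_variables.items():
            for condition in conditions:
                hypotheses.append((ctype, level, variable, condition))
    return hypotheses


hypotheses = operational_hypotheses()
print(len(hypotheses))  # 4 compatibilities x 5 variable-condition cells = 20
```

Dropping co-worker choice leaves 4 x 4 = 16 combinations for the two measures of interpersonal work effectiveness.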


Subjects

Subjects for the study were male industrial supervisors
drawn from a single company employing over
100,000 people. Ten of this company’s 50 manufacturing
[...] were selected on the basis of variety
[...] size, product, and location.
At these 10 plants, all supervisory staff units, consisting
of a boss and those subordinates who report
directly to him, that met the following two criteria
were identified: (a) The boss was located on either
the second or third supervisory level from the bottom
of the organizational hierarchy. (b) The boss
and at least four supervisory subordinates had been 
members of that staff unit for 1 or more years. The 
rationale for choosing lower supervisory levels was 
to use the organizational levels that deal most with 
interpersonal issues. A 1-year relationship length was 
required, based on evidence pointing to the possibil- 
ity that compatibility has effect only after consider- 
able acquaintanceship (Kerckhoff & Davis, 1962). 
These selection criteria identified 100 acceptable staff 
units. Through a series of random selection steps, 
voluntary subject dyads for the three dependent 
variable conditions were identified. For assessing 
work relationship effectiveness, 26 authority dyads
and 52 peer dyads were each rated by the five mem- 
bers (boss and four subordinates) of their staff units. 
For assessing task performance, 33 authority dyads 
and 33 peer dyads completed the simulation task. 
Thirty-six dyads represented mutual choices for the 
co-worker choice variable (18 for most, 18 for least). 


RESULTS 
Work Relationship Effectiveness 

The hypotheses derived from the FIRO 
theory regarding compatibility and work rela- 
tionship effectiveness were not supported. Of 
the four hypotheses (derived through com- 
paring originator and interchange compati- 
bilities with both behavior and feeling levels) 
in the authority condition (boss-subordinate 
relations) only one, originator compatibility — 
feeling level, reached statistical significance 
(U = 50.5, p < .05), and it was in a direction 
opposite to that hypothesized. 

Three of the four compatibility/work rela- 
tionship effectiveness findings in the peer 
condition (subordinate-subordinate relation- 
ships) were not statistically significant. Origi- 
nator compatibility — feeling level was found 
to be significant (U = 225, p < .02) in the 
hypothesized direction. Since originator com- 
patibility was significantly related to work 
relationship effectiveness in both authority 
and peer conditions (although directionally 
opposed), some evidence exists that this type 
and level of compatibility may be salient in
the interpersonal work of the subjects. If so,
it would be expected to appear again in the
task performance criterion.


Task Performance


While the interchange compatibility–behavior
level/task performance relationship
reached an acceptable probability limit (r_s =
.29, p < .05), the coefficient of determination
for this magnitude of r_s indicates that only
8% of the total variation is accounted for.
Therefore, while chance can be rejected as
accounting for the relationship, the degree of
[...] tested this variable in the work context of the
study. Only 2 of the 16 hypotheses were supported [...]

Hypotheses relating compatibility to co-worker
choice were not supported. Interchange
compatibility–behavior level nearly
reached significance (a U of 110.5 compared
to a required U of 109 or less), but the
direction was opposite to prediction.
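
The 8% figure cited for the task performance relationship is the squared correlation. A brief sketch of that arithmetic is given below; the function name is an assumption, and treating the square of a rank correlation as shared variation among ranks is the usual reading rather than something the source spells out.

```python
def coefficient_of_determination(r):
    """Proportion of variation shared by two variables correlated at r."""
    return r ** 2


shared = coefficient_of_determination(0.29)
print(f"{shared:.4f}, or about {round(shared * 100)}% of the total variation")
```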
[...] generalization [...] themselves, indicate any
[...] compatibilities associated
with these hypotheses [...] originator
compatibility–feeling level and interchange
compatibility–behavior level [...] crosso[...]
compatibility, levels of n[...]
conditions. Two other relationships reached,
or nearly reached, a magnitude of statistical
significance: originator compatibility–feeling
level in the authority condition for work
relationship effectiveness [...]tions.
Since originator compatibility–feeling
level and interchange compatibility–behavior
level were the only compatibilities of significant,
or near significant, magnitude, it is
tempting to consider them salient in the study.
Such a conclusion, however, ignores the crossovers
and directional reversals. The only substantial
generalization that can safely be
drawn from the results is that the FIRO theory
of compatibility failed to be substantiated
in the work context used in this study.

The question to be answered, then, is why
the theory failed to hold. One possibility,
derived from the low test-retest compatibility
correlations, is that instrument failure rather
than theory failure produced the results.
Although it is typical for personality measures
to have low stability, the compatibility
correlations, ranging from .44 to .61, are low
enough to indicate serious qualification of any
results attained from their use.

Another possibility is suggested by the
range of compatibility scores produced by the
subject dyads. The possible range of
compatibility scores is from 0 to 54.² The maximum
score distribution of the subject dyads
for ± 3 standard deviations ranged from 20
to 54. The relative absence of very low
compatibilities could indicate that only low
compatibility is related to the criteria studied. If
so, this study would not have tested such a
possibility.

Doubt about the FIRO theory’s applicability
to a work context is also supported,
because the theory does not contain a work
orientation construct. Bion (1961) contended
that individuals were oriented toward work
in addition to emotional orientations. The
possibility exists that the construct of work
orientation may have such strong salience for
industrial supervisors that Schutz’s interpersonal
need constructs are diminished in importance
[...] constructs to explain interpersonal behavior in
work situations appears to be warranted.

² Schutz's compatibility formulas yield low scores
for high compatibility, with a score of 54 representing
least compatibility. In the present study, all
compatibility scores were subtracted from 54 to yield
higher scores for greater compatibility.

Finally, an attack could be mounted against
the fundamental basis of the FIRO theory.
The theory is psychoanalytically based, con- 
tending that interpersonal needs are acquired 
in childhood and are enduring characteristics 
deep within the personality. Argyris (1965) 
has argued a converse. He contends that inter- 
personal competence is learned and learnable 
throughout life. Competence, as Argyris uses 
it, may be interpreted as compatibility. Also, 
it is possible that interpersonal effectiveness 
is more a function of cognitive styles than of 
emotional needs. Perhaps one of the cognitive 
theories (Festinger, 1957; Heider, 1946) 
would be more isomorphic with a work con- 
text that emphasizes rational-intellective 
processes. 

With respect to the FIRO theory, the re- 
sults of this study suggest a limit of the the- 
ory's application to situations. Prior studies 
attest to its applicability to situations in 
which personal and interpersonal content is 
predominant (love, therapy, learning). In
industry, the predominant content of interpersonal
activity concerns materials, costs, schedules,
and less personal information. Such a
context may not evoke interpersonal needs to 
the degree necessary to make the aim of their 
satisfaction a salient goal. 

Findings of the study also have implications 
for work organizations. It may be that strate- 
gies to match personalities in order to gain 
interpersonal effectiveness, though frequently 
attempted, are not a productive effort. 

The FIRO theory has the appeal of concep- 
tual simplicity and construct specificity. Its 
implications and potential values are broad 
enough to justify further examination, includ- 
ing more tests of work contexts and using, 
perhaps, instruments that are more effective 
for assessing interpersonal needs and that are 
capable of greater stability. Also, a more de- 
tailed investigation of compatibility by indi- 
vidual need might disclose that summing the 
need compatibilities (inclusion, control, and
affection) produced too gross a measure of
compatibility. It is as important to delimit
theory as it is to extend it. Perhaps this study 
has value in terms of establishing limits. 


REFERENCES


Argyris, C. Explorations in interpersonal competence–I.
Journal of Applied Behavioral Science,
1965, 1, 58-83.

Arndt, G. M. An investigation of the influence of
interpersonal compatibility on counselor and counselee
perceptions of the initial interviews. Unpublished
doctoral dissertation, University of Rochester,
1968.

Bion, W. R. Experiences in groups. New York:
Basic Books, 1961.

Eisenthal, S. The dependence of visibility of values
upon group compatibility and level of need for
affection. Unpublished doctoral dissertation,
University of Kansas, 1961.

Estadt, B. K. Interpersonal compatibility and
intragroup choices. Unpublished doctoral dissertation,
Catholic University of America, 1964.

Festinger, L. A theory of cognitive dissonance.
Evanston, Ill.: Row, Peterson, 1957.

Gross, R. L. Therapy group composition: Personal-
interpersonal variable. Unpublished doctoral
dissertation, University of Utah, 1959.

Hartsough, D. M. Effects of interpersonal orientation
and language similarity on verbal communication
in dyadic interpersonal relationships. Unpublished
doctoral dissertation, University of Florida,
1964.

Heider, F. Attitudes and cognitive organization.
Journal of Psychology, 1946, 21, 107-112.

Hutcherson, D. E. Relationships among teacher-
pupil compatibility, social studies grades, and
selected factors. Unpublished doctoral dissertation,
University of California, Berkeley, 1963.

Kerckhoff, A. C., & Davis, K. E. Value consensus
and need-complementarity in mate-selection.
American Sociological Review, 1962, 27, 295-303.

Kramer, E. A contribution toward the validation
of the FIRO-B questionnaire. Journal of Projective
Techniques and Personality Assessment, 1967,
31, 80-81.

Obradovic, S. M. Interpersonal factors in the
supervisor-teacher relationship. Unpublished doctoral
dissertation, University of California, Berkeley,
1962.

Powers, J. R. Trainer orientation and group
composition in laboratory training. Unpublished doctoral
dissertation, Case Institute of Technology,
1965.

Sapolsky, A. Effect of interpersonal relationships
upon verbal conditioning. Journal of Abnormal
and Social Psychology, 1960, 60, 241-246.

Sapolsky, A. [...] the FIRO scale with a
group having limited interpersonal contact. Journal
of the Hillside Hospital, 1964, 13, 95-99.

Sapolsky, A. Relationship between patient-doctor
compatibility, mutual perception, and outcome of
treatment. Journal of Abnormal Psychology, 1965,
70, 70-76.

Schutz, W. C. FIRO-B. Palo Alto, Calif.: Consulting
Psychologists Press, 1957. (a)

Schutz, W. C. FIRO-F. Palo Alto, Calif.: Consulting
Psychologists Press, 1957. (b)

Schutz, W. C. FIRO: A three-dimensional theory
of interpersonal behavior. New York: Holt, Rinehart
& Winston, 1958.

Schutz, W. C. The interpersonal underworld. Palo
Alto, Calif.: Science and Behavior Books, 1966.

Schutz, W. C. The FIRO scales, manual. Palo Alto,
Calif.: Consulting Psychologists Press, 1967.

Schutz, W. C., & Allen, V. L. The effects of a T-group
laboratory on interpersonal behavior. Journal
of Applied Behavioral Science, 1966, 2, 265-286.


WILLIAM J. UNDERWOOD AND Larry J. KRAFFT 


Siegel, S. Nonparametric statistics for the behavioral
sciences. New York: McGraw-Hill, 1956.

Vodacek, J. A study of the relationship of FIRO-B
measures of compatibility to teacher satisfaction
and congruence of role expectations for the principal.
Unpublished doctoral dissertation, University
of Wisconsin, 1961.

Yalom, I. D., & Rand, K. Compatibility and cohesiveness
in therapy groups. Archives of General
Psychiatry, 1966, 15, 267-275.


(Received February 7, 1972) 


Journal of Applied Psychology 
1973, Vol. 58, No. 1, 95-100


CAUSAL CONNECTIONS AMONG MANAGERS’ MERIT
PAY, JOB SATISFACTION, AND PERFORMANCE ¹


CHARLES N. GREENE ²


This study investigated the source and direction of causal influence in the
relationships among merit pay, satisfaction, and performance. The
frequency-of-change-in-product-moment and cross-lagged correlation techniques were
utilized to analyze field study data obtained from a sample of 62 managers
over a 2-year time interval. The results indicated that merit pay caused
satisfaction and, further, that merit pay increased the correlation between these
two variables. Satisfaction was found to be an effect and not a cause of
performance. The influence of performance was to increase the correlation
between performance and satisfaction. The merit-pay-causes-performance
hypothesis was not supported, but the results suggested the possibility of
reciprocal causation.


The relationship between job satisfaction
and performance constitutes perhaps the most
provocative area of study concerning behavior
in industrial organizations. Four decades have
elapsed since the initial investigation by
Kornhauser and Sharp (1932), yet interest in this
subject by both the practitioner and
researcher has grown. This growing interest has
occurred in spite of Brayfield and Crockett’s
(1955) conclusions that “there is little
evidence in the available literature that
employee attitudes ... bear any simple–or, for that
matter, appreciable–relationship to
performance on the job [p. 422].”

At least three theoretical propositions
underlie the research and writing in this area.
The first and most pervasive stems from the
human relations movement and its emphasis
on the well-being of the individual. According
to this view, job satisfaction expressed by an
individual determines his performance: that
is, satisfaction causes performance. While this
view has received popular support, it has
little empirical basis. In his review of 23
correlational studies conducted on the subject,
Vroom (1964) reported a statistically
insignificant (r = .14) but consistent median
correlation between employee satisfaction and
various measures of performance. None of
these studies was designed to test the
causality of the relationship.

The second proposition, best represented by 
the work of Porter and Lawler (1968), con- 
siders satisfaction not as a cause, but as an 
effect of performance; that is, performance 
causes satisfaction. According to this view,
differential performance determines rewards
which, in turn, produce variance in satisfac-
tion. Thus satisfaction is considered to be a
function of performance-related rewards. At 
the empirical level, one study, conducted by 
Bowen and Siegel (1970), lends some support 
to this theoretical position. The moderate 
time-lag correlation coefficients they reported 
for the performance-causes-satisfaction condi- 
tion were significantly higher than the low 
correlations observed in the satisfaction- 
causes-performance condition. The rewards 
variable was not considered in this study. 

Closely related to the second view is a 
third in which both performance and satis- 
faction are considered to be a function of re- 
wards; that is, performance-contingent re- 
wards cause performance and rewards cause 
satisfaction. According to this view, formu-
lated by Cherrington, Reitz, and Scott (1971)
from contributions of reinforcement theorists, 
there is no inherent relationship between sat- 
isfaction and performance. The results of their 
experimental investigation strongly supported 
their theoretical propositions. The rewarded 
subjects, undergraduate business students,
reported significantly greater satisfaction than 
did nonrewarded subjects. Furthermore, the 
subsequent performance of the appropriately 
reinforced subjects (rewarded high performers 
and nonrewarded low performers) was signifi- 
cantly higher than that of the inappropriately 
rewarded subjects (rewarded low performers 
and nonrewarded high performers). Signifi- 
cant correlations were observed between per- 
formance and satisfaction expressed in the 
subsequent period, but only in accord with the 
established reward-nonreward reinforcement 
schedules. The correlation between satisfac- 
tion and subsequent performance, excluding 
the effects of rewards, was .00; that is, satis-
faction does not cause performance.

In only one study (Cherrington et al., 
1971) have rewards, monetary bonuses in 
this case, been causally related to both per- 
formance and satisfaction and the perfor- 
mance-causes-satisfaction and satisfaction- 
causes-performance propositions been rigorously tested. While highly
significant findings were observed, this particular investigation was
conducted in a laboratory setting with students as subjects, and thus the
question arises, as it does with all laboratory experiments, as to whether the
adequacy of the simulation of the real world warrants using the results with
confidence when dealing with more applied problems. A logical alternative to
the laboratory experiment is the field experiment. However, field experiments
present a number of practical and ethical problems, [...] which provide [...]
bases of the [...] cross-lagged analysis and the frequency-of-change-in-
product-moment (FCP) technique [...] provides a basis for inferring causality
without the traditional manipulation of variables in a controlled environment.
The purpose of the study reported here [...] (a) merit pay and job
satisfaction, (b) merit pay and performance, and (c) job satisfaction and
performance by using the two correlational-causal approaches to analyze data
obtained in a field setting.


METHOD 
Sample 


The sample consisted of 62 first-line managers rep- 
resenting both the line and staff functions of the 
marketing and financial divisions of a large manu- 
facturer of business and communications equipment. 
Each first-line manager was responsible for the activ- 
ities of at least four subordinates who reported 
directly to him. With the exception of the compen- 
sation information, the data were collected by means
of questionnaires.


Measures 


Both the cross-lagged and the FCP techniques re- 
quire that identical measurements of merit pay, job 
satisfaction, and performance be taken at two points 
in time (Time 1 and Time 2) with the same subjects. 
The measures for each manager were obtained ap- 
proximately at the time of his annual salary-per- 
formance review in 1969 (Time 1) and in 1970 
(Time 2).* 

Announced salary increases and company salary 
records were sources of the measure of merit pay.
The merit pay measure for each manager was the
amount of his annual salary increase less cost of 
living adjustments, expressed as a percentage of his 
previous year’s annual salary. The range of annual 
merit increases for the total sample for both Time 
1 and Time 2 extended from 3% to 15%. 
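
As a worked illustration of this definition, the following Python sketch computes the merit pay measure for one manager. The function name, the argument names, and the salary figures are hypothetical and are not taken from the study; only the formula (annual salary increase less cost-of-living adjustment, expressed as a percentage of the previous year's salary) comes from the text.

```python
def merit_pay_percent(previous_salary, total_increase, cost_of_living_adjustment):
    """Merit increase as a percentage of the previous year's annual salary.

    All three arguments are in the same currency unit. The names are
    illustrative assumptions, not terms used by the author.
    """
    if previous_salary <= 0:
        raise ValueError("previous_salary must be positive")
    merit_increase = total_increase - cost_of_living_adjustment
    return 100.0 * merit_increase / previous_salary


if __name__ == "__main__":
    # Hypothetical manager: 20,000 salary, 2,600 raise, 600 of it cost of living.
    print(merit_pay_percent(20000, 2600, 600))
```

A value computed this way would be comparable to the 3% to 15% range reported for the sample.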

Job satisfaction expressed by the first-line manager 
was measured by the scale developed by Bullock 
(1952). He reported a test-retest reliability coeffi- 
cient of .94 and split-half reliability coefficients of 
.93, .90, and .90 were observed when the scale was
administered to successive samples. In this 10-item
scale, the first 9 items required the manager to re-
spond to one of five alternatives ranging from highly
satisfied to highly dissatisfied. The tenth item required
the respondent to indicate on a summary scale his
overall job satisfaction. Scores were computed by
summing across the items. The maximum total score
was 50 and the minimum 10. The range of scores 
reflected in the total sample extended from 23 to 49. 

The current performance of the manager was evaluated by two of his peers.
Identical 7-point semantic differential scales were devised for the raters to
indicate their evaluations of both the quality and quantity of the manager's
performance. The responses were scaled 1 (extremely low quality or quantity of
work) to 7 (extremely high quality or quantity of work). The analysis,
however, revealed a high level of agreement between raters; the values of r on
both performance dimensions were .58 and .73 for Time 1 and .62 and .79 for
Time 2. When the ratings on both performance dimensions were combined, the
correlations between the raters were .71 for Time 1 and .78 for Time 2 (all
ps < .001). As a result, both sets of peer ratings were combined to provide
one rating that was a more comprehensive measure of the manager’s performance.
The mean averaged peer ratings for both time periods extended from 2.68 to
6.51.

* A third time lag (1971) had been planned; however, economic conditions in
the industry which negatively affected the company’s financial situation
significantly reduced the amount of funds available for merit increase
purposes. As a result, the data collected during 1971 were excluded from the
study and the third time lag abandoned.


Analytical Procedures 


The cross-lagged panel correlation technique was 
presented initially by Simon (1954, 1957), noted later
by Blalock (1962), Campbell (1963), Campbell and 
Stanley (1963), Pelz and Andrews (1964), and 
McGuire (1967), and applied most recently by Law- 
ler (1968) and Yee and Gage (1968). Underlying 
this technique, and specifically the time-interval re- 
quirement, is the assumption that if one variable (C) 
causes another variable (E) there will be a time lag 
between the two variables. For example, if per- 
formance causes satisfaction, the present (Time 1) 
state of performance should be more highly related 
to the future (Time 2) state of satisfaction than it 
is to either the present or the past state of satisfac-
tion. Comparisons of the relative magnitudes of the
correlations between performance and the three
states of satisfaction thus provide a basis for evalu-
ating the performance-causes-satisfaction and satis-
faction-causes-performance propositions.
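
The comparison described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's procedure: the function names and dictionary keys are hypothetical, the correlations are plain Pearson product-moment coefficients, and no significance test of the difference between the two cross-lagged correlations is included.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cross_lagged_correlations(perf1, perf2, js1, js2):
    """Return the two same-time and the two cross-lagged correlations.

    perf1/js1 are Time 1 scores and perf2/js2 are Time 2 scores for the
    same respondents, listed in the same order.
    """
    return {
        "perf1_js1": pearson_r(perf1, js1),  # Time 1, same-time
        "perf2_js2": pearson_r(perf2, js2),  # Time 2, same-time
        "perf1_js2": pearson_r(perf1, js2),  # performance-causes-satisfaction lag
        "js1_perf2": pearson_r(js1, perf2),  # satisfaction-causes-performance lag
    }
```

If `perf1_js2` clearly exceeds `js1_perf2` and both same-time correlations, the pattern is the one the text associates with the performance-causes-satisfaction proposition.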

The most serious problem encountered with the cross-lagged correlation
technique, noted by Yee and Gage (1968), is the limited number of inferences
which this analysis makes possible. If, for example, the relationship between
performance in Time 1 and satisfaction in Time 2 ($r_{Perf_1 JS_2}$) is
significantly greater than the relationship between Time 1 satisfaction and
Time 2 performance ($r_{JS_1 Perf_2}$), it can be inferred that performance
caused satisfaction. Conversely, the finding that
$r_{JS_1 Perf_2} > r_{Perf_1 JS_2}$ indicates that satisfaction caused
performance. The problem, identified initially by Rozelle (1965), arises from
the fact that there are at least two additional inferences which are possible.
The first finding, $r_{Perf_1 JS_2} > r_{JS_1 Perf_2}$, could result not only
from (a) performance having greater influence to increase the correlation
between performance and satisfaction but also from (b) satisfaction having
greater influence to decrease the correlation between the two variables.
Similarly, the second finding, $r_{JS_1 Perf_2} > r_{Perf_1 JS_2}$, could be
attributed to either (c) performance causing the correlation to decrease or
(d) satisfaction causing it to increase. In other words, there are at least
four, not two, rival hypotheses: A increases B, A decreases B, B increases A,
and B decreases A. In summary, the major problem encountered with cross-lagged
correlations is that it is not possible to distinguish between the source and
direction of influence of the two correlated variables, that is, to determine
which variable had the greatest influence and whether it increased the
correlation (positive effect) or decreased the correlation (negative effect).

The FCP technique was devised by Yee (1968; 
Yee & Gage, 1968) in part to overcome this weakness 
in the cross-lagged correlation technique. The FCP
technique requires that the data collected for each
individual respondent be placed into one of four
categories. Using the relationship between job satis-
faction and performance as an example, the data for
each manager were placed into the JS+, JS−, Perf+,
or Perf− category according to the following pro-
cedures:


1. The Time 1 and Time 2 raw scores for satis-
faction and performance for each respondent were
converted to standard scores. In other words,
$z = (X - \bar{X})/s$ was computed for each score.

2. The direction of influence, positive or negative, was identified for each
case by determining if the cross-product of the Time 2 z scores was greater or
less than the cross-product of the Time 1 z scores. If the cross-product of
the Time 2 z scores, $z_{JS_2} z_{Perf_2}$, was greater than
$z_{JS_1} z_{Perf_1}$, the direction of influence was considered to be
positive; that is, the interaction between satisfaction and performance
increased the overall correlation. The opposite condition, when the
cross-product of the Time 2 z scores was less, indicated a negative direction
of influence; that is, the interaction between satisfaction and performance
decreased the overall correlation between satisfaction and performance.

3. The source of influence was ascertained by ex- 
amining the cross-lagged z products for each case. 
Where the direction of influence was positive, the 
variable whose Time 1 measure was part of the 
larger cross-lagged z product was considered as the
source of influence. On the other hand, if the direc-
tion of influence was negative, the variable whose
Time 1 measure was part of the smaller cross-lagged
z product was considered as the source of influence.
In summary:


If the direction of the change was positive (that is,
$z_{JS_2} z_{Perf_2} > z_{JS_1} z_{Perf_1}$) and if
$z_{JS_1} z_{Perf_2} > z_{Perf_1} z_{JS_2}$, then satisfaction was identified
as the source of the positive influence (denoted by JS+). Conversely, if
$z_{Perf_1} z_{JS_2} > z_{JS_1} z_{Perf_2}$, performance was the source of the
positive influence (denoted by Perf+).

If the direction of the change was negative (that is,
$z_{JS_2} z_{Perf_2} < z_{JS_1} z_{Perf_1}$) and if
$z_{JS_1} z_{Perf_2} > z_{Perf_1} z_{JS_2}$, then satisfaction was the source
of the negative influence (denoted by JS−). However, if
$z_{Perf_1} z_{JS_2} > z_{JS_1} z_{Perf_2}$, performance is considered the
source of the negative influence (denoted by Perf−).


4. After each of the cases was classified into one of the four categories
(JS+, JS−, Perf+, or Perf−), chi-square tests were computed to determine if
the number of cases placed in a given category differed significantly from the
number placed in the other three categories. Chi-square was computed using the
general formula and Yates' correction for continuity (Guilford, 1965).


TABLE 1

FREQUENCIES-OF-CHANGE-IN-PRODUCT-MOMENT RESULTS

Relationships                      Frequencies                                   Values of chi-square

A. Merit pay - job satisfaction    MP+ = 21, MP− = [...], JS+ = 7, JS− = 9       H1b^a: 12.74**; MP+ = JS+: 13.92**; MP− = JS−: 1.04
B. Merit pay - performance         MP+, MP−, Perf+, Perf−: [...]                 H2b^b: .02; MP+ = Perf+: [...]; MP− = Perf−: .05
C. Satisfaction - performance      JS+ = 10, JS− = 10, Perf+ = 30, Perf− = 12    H3b^c: 7.11*; JS+ ≠ Perf+: 9.02*; JS− ≠ Perf−: .03

a Hypothesis 1b: [(MP+) + (MP−)] > [(JS+) + (JS−)].
b Hypothesis 2b: [(MP+) + (MP−)] > [(Perf+) + (Perf−)].
c Hypothesis 3b: [(JS+) + (JS−)] < [(Perf+) + (Perf−)].
* p < .01.
** p < .001.
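
The four-step FCP procedure described above can be sketched in Python as follows. This is an illustrative reading, not the author's or Yee's implementation, and all function and parameter names are hypothetical. Three assumptions are made: z scores use the population standard deviation; cases with tied products are left unclassified, since the text does not address them; and the chi-square compares two counts against equal expected frequencies. The source rule for negative cases follows the wording of step 3 (the smaller cross-lagged product marks the source). Because the summary inequalities after step 3 can also be read as taking the larger product in both directions, that rule is exposed as a flag rather than fixed.

```python
from math import sqrt


def z_scores(values):
    """Step 1: convert raw scores to standard scores, z = (X - mean) / s."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: s is the population SD (divide by n); the text does not say.
    s = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / s for v in values]


def fcp_classify(js1, js2, perf1, perf2, smaller_marks_negative_source=True):
    """Steps 2 and 3: label each case JS+, JS-, Perf+, or Perf- (None on ties)."""
    labels = []
    standardized = (z_scores(v) for v in (js1, js2, perf1, perf2))
    for a1, a2, b1, b2 in zip(*standardized):
        sync1, sync2 = a1 * b1, a2 * b2      # same-time z products
        js_lag, perf_lag = a1 * b2, b1 * a2  # cross-lagged z products
        if sync1 == sync2 or js_lag == perf_lag:
            labels.append(None)  # ties are not covered by the text
            continue
        positive = sync2 > sync1             # step 2: direction of influence
        js_larger = js_lag > perf_lag
        if positive or not smaller_marks_negative_source:
            js_is_source = js_larger         # larger lagged product is the source
        else:
            js_is_source = not js_larger     # smaller lagged product is the source
        labels.append(("JS" if js_is_source else "Perf") + ("+" if positive else "-"))
    return labels


def yates_chi_square(count_a, count_b):
    """Step 4: 1-df chi-square with Yates' correction for two category counts,
    assuming equal expected frequencies under the null hypothesis."""
    expected = (count_a + count_b) / 2
    return sum((abs(c - expected) - 0.5) ** 2 / expected for c in (count_a, count_b))
```

Under the equal-expectation assumption, `yates_chi_square(30, 10)` gives about 9.025 and `yates_chi_square(42, 20)` about 7.113, which agree with the values of 9.02 and 7.11 reported for the satisfaction-performance frequencies (Perf+ = 30 versus JS+ = 10, and 42 performance cases versus 20 satisfaction cases).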


RESULTS 


The empirical propositions cited at the out-
set of this paper, particularly the findings re-
ported by Cherrington et al. (1971), provide
the basis for the hypotheses examined here.
In review, Cherrington et al. found that re-
wards (monetary bonuses) cause satisfaction,
performance-contingent rewards cause satis- 


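[...]

1. Merit pay causes job satisfaction; that is, (1a)
$r_{MP_1 JS_2} > r_{JS_1 MP_2}$ and (1b) [(MP+) + (MP−)] > [(JS+) + (JS−)].

2. Merit pay causes performance; that is, (2a)
$r_{MP_1 Perf_2} > r_{Perf_1 MP_2}$ and (2b) [(MP+) + (MP−)] > [(Perf+) + (Perf−)].

3. Job satisfaction does not cause performance; that is, (3a)
$r_{JS_1 Perf_2} \le r_{Perf_1 JS_2}$ and (3b) [(JS+) + (JS−)] < [(Perf+) + (Perf−)].

The cross-lagged [...] coefficients [...] testing the first hypothesis [...]
type of relationships between variables that would be expected if merit pay
causes job satisfaction [...] of the cross-lagged correlations [...]
$r_{MP_1 JS_2}$ = .4[...] (p < .001) and [...]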
The Time 1 and Time 2 correlations also were of lesser magnitudes,
$r_{MP_1 JS_1}$ = .21 and $r_{MP_2 JS_2}$ = .28 (p < .05). The results obtained
by the FCP technique, presented in Table 1A, support the correlational
results. Merit pay, not satisfaction, was the primary source of influence
(χ² = 12.74, p < .001). Furthermore, the direction of the influence of merit
pay was to increase the correlation and it did so to a significantly greater
degree than did satisfaction (MP+ = 21, JS+ = 7, χ² = 13.92, p < .001).

The FCP results concerning the relationship between merit pay and performance
do not offer evidence of causality. None of the FCP results, noted in Table
1B, even approach significance. The correlation coefficients also fail to
support one causal direction more than the other. The Time 1 and Time 2
correlations were .34 (p < .01) and .37 (p < .01), respectively. The
cross-lagged correlations, however, were stronger, though not significantly
more so, than the Time 1 and Time 2 correlations and were of virtually equal
magnitude: $r_{MP_1 Perf_2}$ = .44 (p < .001) and $r_{Perf_1 MP_2}$ = .42
(p < .001). Given Cherrington et al.'s (1971) finding that
performance-contingent rewards cause performance, the cross-lagged results do
suggest the possibility of reciprocal causation. It is conceivable that pay
caused performance and that management, according to company policy, had
related the subjects’ pay to their prior performance.


* The data also were analyzed with the superior's evaluation of the manager's
performance included in addition to the two peer ratings as the measure of
performance. The results were consistent with those reported here with the
exception that the Time 1 and Time 2 correlations and the
performance-causes-merit-pay correlation were marginally stronger. These
results were not reported since the superior was a primary decision maker in
determining the amount of the manager's merit increase and, thus, his
performance rating constituted a source of contamination of the performance
measure.

The data clearly support the hypothesis that satisfaction does not cause
performance and further provide evidence supporting the opposite condition,
that performance causes satisfaction. The cross-lagged correlations were
$r_{JS_1 Perf_2}$ = .17 and $r_{Perf_1 JS_2}$ = .49 (p < .001). The magnitudes
of Time 1 and Time 2 correlations were relatively low and comparable to the
insignificant satisfaction-causes-performance cross-lagged correlation,
$r_{JS_1 Perf_1}$ = .20 and $r_{JS_2 Perf_2}$ = .23 (p < .05). The FCP
results, presented in Table 1C, offer additional support. These results, which
reveal that performance was the primary source of influence (χ² = 7.11,
p < .01), also show the significantly more positive effects that performance
had on the relationship between performance and satisfaction (Perf+ = 30,
JS+ = 10, χ² = 9.02, p < .01).


DISCUSSION AND CONCLUSIONS


The results of this study support the hypotheses that merit pay causes
satisfaction but satisfaction does not cause performance. Concerning the
satisfaction-performance relationship, however, the results do provide
evidence supporting the opposite condition, that performance causes
satisfaction. The FCP analysis further reveals that the influence of both
merit pay and performance was to increase the correlation between both of
these variables and satisfaction. The merit-pay-causes-satisfaction finding
supports Cherrington et al.’s (1971) experimental results and indicates,
contrary to some current [...] (e.g., Herzberg, 1966), that merit pay is a
more frequent source of satisfaction than dissatisfaction. However, when merit
pay does constitute a source of dissatisfaction, and [...]sequent absenteeism
and turnover, it more than likely does so for the high performer because of
management's failure to effectively relate pay to performance. While the high
performer feels deprived compared to the low performer, the low performer is
satisfied with his relative pay.

Despite the lack of confirmation of the 
merit-pay-causes-performance hypothesis, the 
analysis of this relationship reveals correla- 
tions between merit pay and subsequent per- 
formance and between performance and sub- 
sequent merit pay which were moderately 
strong and of virtually equal magnitude. 
These results, combined with Cherrington et
al.'s (1971) finding that performance-con-
tingent rewards cause performance, raise the
rather provocative question concerning recip- 
rocal causation. Merit pay may have caused 
performance since, as indicated by the data, 
the company was apparently moderately suc- 
cessful in implementing their policy of relat- 
ing employees’ pay to their prior performance. 
The company’s use of a fixed-interval pay 
schedule—as opposed to a variable-interval 
schedule in which performance could have 
been reinforced shortly after it had occurred 
—may have obscured part of the reinforcing 
effects that merit pay has on performance. 
While this type of inference is speculative, it 
does emphasize a largely unresolved problem 
with organizational pay schedules. For the 
researcher, it demonstrates the need for de- 
veloping more rigorous methods for examin- 
ing potential causal relationships with field 
study data. 

When combined with the evidence sup- 
porting the merit-pay-causes-satisfaction and 
the performance-causes-satisfaction hypothe- 
ses, the relatively strong correlation between 
performance and subsequent merit pay has an 
additional implication. These particular re- 
sults are consistent with Porter and Lawler’s 
(1968) predictions that differential perfor- 
mance causes rewards which, in turn, cause 
satisfaction. However, these predictions ap- 
plied to merit pay have practical value only 
to the extent that the manager can effectively 
differentiate among performance levels and, 
further, to the extent that he possesses both 
the monetary resources and the willingness to 
reinforce his subordinates on the basis of their 
performance. 
A measure of a general readiness to accept change and a measure of attitudes
toward change in a specific area (promotions policy) were correlated with age,
education, managerial rank, frustration-contentment, level of self-confidence,
extraversion, and neuroticism in a sample of 258 managers. Older managers
were more conservative than the younger managers, and the confident more
radical than the unconfident. Stable introverts and emotional extraverts tended
to support innovation, while emotional introverts and stable extraverts did not.
The relation between readiness to accept change and managerial status, educa-
tion, and contentment depended on the type of change and was not always
linear. Nevertheless, the results support the notion of a general readiness for
change underlying attitudes toward change in a specific area.


Kahn, Wolfe, Quinn, Snoek, and Rosenthal 
(1964) have argued that in almost any pro- 
cess of organizational change "there will be a 
new guard advocating innovation and an old 
guard urging the retention of the status quo 
[p. 128]." Who will side with whom? Com- 
monly, it is assumed that attitudes to change 
in management practices will be less radical 
among the old, the less trained, and managers 
of higher rank than among the young, the 
trained, and those who have yet to make it. 
The stereotype (mostly taken for granted but 
sometimes explored, e.g. Soddy, 1967) is of 
the educated and bustling young (whose tal- 
ents have yet to be tried) welcoming change, 
while their more settled seniors, graduates of
the schools of experience, do not.

What little evidence there is for this view 
is based on measures of general attitudes to 
innovation. If confronted with the reality of 
change with, perhaps, high prices to pay for
open-mindedness, would the alleged conserva-
tism of the older manager still show up
strongly? Other work suggests (a) that there
is a general attitude toward change which is
positively (although weakly) related to desire
for specific changes (Gruenfeld & Foltman,
1967; Hardin, 1967; Trumbo, 1961) and (b)
that those who have the power to initiate or
veto change are, in general, more sympathetic
to change than those on whom it may be im-
posed (Bass, 1960; Cartwright & Zander,
1962). Clearly there are influential factors 
other than age and obstructiveness to be con- 
sidered. 

The purpose of this research was to com- 
pare general attitudes to change in mana- 
gerial practice, attitudes to a likely change in 
a specific area of managerial practice, and 
conditioning factors of these attitudes. The 
possible adoption of thorough-going appraisal 
schemes by companies that at the time did not 
operate them was the specific area of mana- 
gerial practice selected. Appraisal schemes 
were chosen for study for two reasons. First, 
they belong to the area of promotions policy 
where changes, being directly relevant to the 
careers of all, are likely to be of interest to 
everyone. Second, appraisal schemes have at 
least five features in common—objective selec- 
tion methods, regular written assessments, an 
emphasis on present ability rather than senior- 
ity, feedback on how one is being seen to 
perform, and job specification. As each of 
these features can be disputed or defended in 
turn, they can be used collectively to scale 
attitudes to appraisal schemes as a whole, 
with those completely in favor of or com- 
pletely opposed to such schemes placed at the 
ends of the scale and those with differing de- 
grees of reservation placed in between. 

The factors selected as relevant to attitudes 
to change were of three sorts. First, age, 
managerial status, and education were in- 
cluded because other studies have demon-
strated their connection with general resis-
tance to, or acceptance of, change (Dalton, 
1969; Pym, 1965). Second, interviews in a 
pilot study clearly indicated that the level of 
confidence with which a man held down his 
job and the degree to which he felt his talents 
and ambition had been matched to his posi- 
tion in his firm would affect the favor or dis- 
favor with which he regarded changes in pro- 
motions policy. So, measures of confidence 
and of frustration-contentment were included.

The third set of measures were of introver-
sion-extraversion and neuroticism. These
variables have been shown to be sufficiently 
basic to touch upon an enormous range of 
behaviors, and theoretically, there is reason 
to believe they should also be related to 
attitudes to change (Eysenck, 1970). 

The Trumbo scale, which contains a set of
nine Likert-type items, was selected to
measure attitudes to change in managerial
practices in general. Developed for use in
management studies, it is short, to the point,
reliable, and shows the expected relations be-
tween conservative-radical attitudes to change
and other variables (Hardin, 1967; Pym,
1965; Trumbo, 1961). The shortened version
of the Maudsley Personality Inventory (Ey-
senck, 1958) was chosen to test introversion-
extraversion and neuroticism.

The literature contained, however, no suitable measure of attitudes to
appraisal schemes nor short relevant tests of confidence and
frustration-contentment. Consequently, the research fell into two parts: (a)
the development of the required tests and (b) the study proper, in which the
two scales measuring attitudes to change in managerial practice (one specific,
the other general) were examined in relation to age, education, managerial
status, contentment, confidence, neuroticism, and extraversion.


METHOD

The Preliminary Study: Development of the Scales


Guided by 27 intensive interviews in one company, a pilot study was carried
out. It comprised a questionnaire of 78 items, touching on features of (a)
appraisal schemes, (b) responsibilities that might trouble a manager in the
course of his work, (c) the relationship between a manager's position in his
firm and his view of his talents, and (d) competing styles and aspects of
managerial life and practice, administered to 80 managers who were attending
training courses (Sample 1). Each item was a statement with which the manager
was asked to record his degree of agreement or disagreement on a 3-point
scale. Sixteen items failed to discriminate. The remaining 62 were
intercorrelated, the principal components then extracted and rotated to simple
structure (varimax). Nineteen factors (with eigenvalues of one or more),
accounting for 61% of the variance, were extracted. Eight were readily
interpretable, and of these, three (Factors I, III, and IV) were directly
relevant to the purpose of the study, since within each were items expected to
form part of the needed scales.

Factor I (labeled conservative-radical and ac- 
counting for 10% of the variance) contained items 
which tapped an unwillingness to change, including 
most of the items on appraisal schemes. Factor III 
(labeled content-discontent and accounting for 7%
of the variance) had its highest loadings on items
which suggested a frustrated manager, overdue in 
his opinion for adequate recognition. Factor IV (la- 
beled confidence level and accounting for 6% of the 
variance) was defined by items reflecting the uncer- 
tainty of a manager in coping with his job. 

To cross-check the pilot study results, a new 
questionnaire was constructed. It was based on 29 
of the 31 items² that correlated at a highly signifi-
cant level³ (p < .001) with one (and only one) of
the three relevant factors. Seven other items (which 
otherwise would have been excluded) were retained 
to help balance the ratio of positive to negative 
statements in the inventory, giving a total of 36.
Twenty-seven of these were in the form of state-
ments provided with five response categories:
strongly agree, agree, undecided, disagree, and
strongly disagree, scored 5 to 1, respectively. The
remaining nine items, which dealt with features of
appraisal schemes, had similar response categories
and scoring, but they differed in that they were
presented in pairs of contrasting statements, one of
which was to be selected for response.

The questionnaire was then given to 358 managers (a 63% response rate) from
eight companies that each had 1,000 employees or more. An analysis of the
first 271 respondents from seven of the companies showed that the younger and
better educated managers were slightly overrepresented. This imbalance was
deliberately redressed when collecting the remaining respondents. The sample
reflects the distribution of age and education in the population from which it
was drawn, but no claim is made for any wider representation. At the time none
of the companies was operating an appraisal scheme, although in each the
adoption of such a scheme was known to be under consideration.

The managers were asked about their education, status, and age, and they
completed the Trumbo scale and the short tests of extraversion-introversion
and neuroticism. Using a table of random numbers, the questionnaires and tests
for 100 of the managers (Sample 2) were separated from the rest and used to
cross-validate the pilot study. The information on the remaining 258 was
analyzed quite independently in the study proper (Sample 3).

The seven balancing items together with the three that failed to discriminate
were dropped. Using only Sample 2 data, the 26 remaining items were
intercorrelated, factor-analyzed (principal components), and rotated to simple
structure (varimax). Using the Kaiser-Guttman criterion, nine factors emerged,
accounting for 66% of the total variance.

The three factors of the pilot study reemerged with minor rearrangement.
Content-discontent now appeared as Factor I (accounting for 19% of the
variance); confidence level as Factor II (13%); and conservatism-radicalism as
Factor III (9%). This last was renamed assessment scheme attitude, since the
five items which had the highest loadings on it dealt with features of
appraisal schemes. The remaining factors were less readily interpretable.

² Two items with high loadings on the conservative-radical factor did not deal
with features of appraisal schemes and were dropped as unwanted intruders.

³ Two-tailed tests of significance are used throughout this report.


CORRELATES OF MANAGERS’ ATTITUDES TOWARD CHANGE

TABLE 1

IMPORTANT LOADINGS ON THE THREE FACTORS COMMON TO BOTH ANALYSES

Factors and their defining items | Sample 1^a | Sample 2^b

Content-discontent (Factor III first analysis; Factor I second analysis)
I sometimes doubt whether my superiors really know what my strong points are | .58 | [...]
Some good people in the department have been overlooked when promotions have been made | [...] | [...]
Promotion in this company too often depends on having the ear of the right people | [...] | [...]
This company takes too much notice of paper qualifications when making promotions | [...] | [...]
I wish my superiors would tell me more about how I am doing | [...] | .32
In a big company like this you feel lost in the crowd | [...] | [...]

Confidence-lacking confidence (Factor IV first analysis; Factor II second analysis)
Sometimes I feel that my present job requires skills or training which I do not have | −.59 | [...]
Personal worries sometimes interfere with my work | [...] | −.69
I feel I am working under too much pressure | [...] | [...]
I find it embarrassing to discuss a subordinate's weaknesses with him | [...] | [...]
I am in close contact with my subordinates and know their strengths and weaknesses | .22 | [...]
Hours of work in this job are too long | −.37 | .48

Appraisal scheme attitude^c (Factor I first analysis; Factor III second analysis)
Against regular, formal written reports kept on file on all managerial staff by their superiors | .66 | [...]
Against job specification, leaving managers largely free to make what they can of their jobs | .63 | .64
Against regular progress discussions | .61 | .57
Immediate boss knows best who among his subordinates is ready for promotion; objective selection methods rarely better at picking a man for the job | .50 | .55
Rapid promotion of the young a threat to morale of long-service personnel | .52 | .32

a n = 80.
b n = 100.
c The items relating to aspects of appraisal schemes, including the five
defining this factor, differed in form and were much wordier than the others
in the questionnaire. Only abbreviated versions of these five items are given
in the table.


The items which had highly significant loadings
(p < .001) in the analysis of both Sample 1 and
Sample 2 questionnaires on at least one of the three
factors (discontent, confidence, and assessment
scheme attitude) are listed in Table 1. These items
were used to produce three scaled scores, one
corresponding to each factor. By reversing items where
applicable, so that all ran in the same direction, the
17 items (each with its internal scale of one to
five) were grouped according to the factors they
defined and then summed to produce the three scores.
Because of the methods used, some confidence can be
placed in the reliability of the factors on which
the scales are based. The correlation (r) between
[illegible] was .06, showing the [illegible].
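The scoring procedure described above (reverse where applicable, then sum within each factor) can be sketched as follows. This is only an illustration: the 1-to-5 item range is assumed from the reported scale ranges, and the item keys and the choice of which item is reversed are hypothetical, since the questionnaire's scoring key is not reproduced here.

```python
from typing import Dict, Iterable

# Assumed item response range (one to five); not stated item by item in the text.
SCALE_MIN, SCALE_MAX = 1, 5


def reverse_item(score: int) -> int:
    """Mirror a response around the midpoint of the scale (1<->5, 2<->4, 3<->3)."""
    if not SCALE_MIN <= score <= SCALE_MAX:
        raise ValueError(f"score {score} outside {SCALE_MIN}-{SCALE_MAX}")
    return SCALE_MIN + SCALE_MAX - score


def scale_score(responses: Dict[str, int],
                items: Iterable[str],
                reversed_items: Iterable[str] = ()) -> int:
    """Sum the items of one factor after reversing the negatively keyed ones."""
    reversed_set = set(reversed_items)
    total = 0
    for item in items:
        raw = responses[item]
        total += reverse_item(raw) if item in reversed_set else raw
    return total


# Hypothetical respondent and hypothetical item keys, for illustration only.
example = {"q1": 5, "q2": 1, "q3": 3}
print(scale_score(example, ["q1", "q2", "q3"], reversed_items=["q2"]))
```

Reversing before summing means a high total always points the same way (for instance, toward greater discontent), which is what allows the summed score to be read as a single scale.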


The Main Study 


Scores were obtained for the 258 managers in Sample 3
on confidence, discontent, and appraisal scheme
attitude (the appraisal scheme [AS] scale).
In addition, managerial status was measured by the
number of steps the individual was removed from
the chief executive in his company. A managing
director received a score of 0, those who reported
directly to him a score of 1, and so on. In all, there


TABLE 2
DISTRIBUTION OF SAMPLE 3 ON MAIN VARIABLES
Variable | Range | M | SD | Meaning of high score
Managerial status | 0-5a | 3.06 | 1.22 | Low status
Education | 0-5a | 2.03 | 1.56 | Highly educated
Age | 23-69 | 43.80 | 9.80 | Older
Discontent scale | 10-30 | 21.32 | 3.14 | Frustrated
Confidence scale | 8-30 | 19.78 | 3.36 | Confident
Extraversion | 0-6a | 3.84 | 1.61 | Extraverted
Neuroticism | 0-6 | 1.83 | 1.63 | Emotionally unstable
Appraisal scheme scale | 5-25a | 13.01 | 3.75 | Favorable attitude to appraisal schemes
Trumbo scale | 18-42 | 31.46 | 6.91 | Favorable attitude to change

a Also corresponds to maximum possible range.


were seven levels over which the sample was
(roughly) normally distributed. Educational level
was measured on a 6-point scale in terms of the
highest examination passed, with no examinations
scored 0 through postgraduate degree scored 5.
The ranges, means, standard deviations, and the
meanings of high scores on these and the remaining
four variables (age, the Trumbo scale, and the
tests of extraversion and neuroticism) are given in
Table 2.


RESULTS 


The two attitude scales (Trumbo and AS)
were treated as the dependent variables
throughout the study, and correlations were
calculated between these scales on the one
hand and the remaining variables on the
other (Table 3).


Managerial Status, Age, and Education


There was no relation between the Trumbo 
scale (measuring a general attitude to change) 


TABLE 3

PRODUCT-MOMENT CORRELATIONS BETWEEN TRUMBO
AND APPRAISAL SCHEME SCALES AND
INDEPENDENT VARIABLES

Variable | Trumbo scale | Appraisal scheme scale
Managerial status | -.08 | [illegible]
Education | .24* | [illegible]
Age | -.36* | [illegible]
Discontent scale | .09 | [illegible]
Confidence scale | .28* | [illegible]
Extraversion | [illegible] | [illegible]
Neuroticism | -.08 | [illegible]
Trumbo scale | | .31*

* p < .01.
and managerial status, but between the AS
scale and status a clear, significant relation 
held. Senior managers looked with favor on 
the characteristic features of appraisal 
schemes—though in the sampled companies 
not favorably enough to have introduced them 
earlier—while those in lower managerial posi- 
tions had their reservations. 

Age, on the other hand, showed a significant
relation with both scales, and in each
case it was the older groups who were the
more conservative. There was, however, no
relation between managerial status and age
(r = .07, p > .10), and classification by age
did not affect the relation between managerial
status and the AS scores.

The Trumbo scale correlated significantly 
with education, the AS scale did not. As age 
and education were negatively correlated in 
the sample (r = —.26, p < .01), correlations 
between the Trumbo scale and education were 
calculated for each of three age groups—those 
below 35, those from 35 to 49 years old, and 
those 50 or older. The correlation rose slightly
for the young and middle-aged groups but
disappeared altogether (r = .01) in the group
who were above 49.
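As a rough consistency check on the significance levels quoted for these correlations, a product-moment r can be converted to a t statistic with n - 2 degrees of freedom. The sketch below assumes n = 258 (the size of Sample 3) for both coefficients and, because the degrees of freedom are large, approximates the two-tailed probability with the normal distribution instead of the exact t distribution; it is not the authors' own computation.

```python
import math


def r_to_t(r: float, n: int) -> float:
    """t statistic for testing a Pearson r against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def two_tailed_p_normal(t: float) -> float:
    """Two-tailed p from the standard normal; an approximation for large df."""
    return math.erfc(abs(t) / math.sqrt(2.0))


N_SAMPLE_3 = 258  # assumed to apply to both correlations

for label, r in [("status with age", 0.07), ("age with education", -0.26)]:
    t = r_to_t(r, N_SAMPLE_3)
    p = two_tailed_p_normal(t)
    print(f"{label}: r = {r:+.2f}, t = {t:.2f}, approximate p = {p:.4f}")
```

Under these assumptions the first coefficient falls well short of the .10 level and the second comfortably passes the .01 level, in line with the values reported in the text.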

There was also a significant correlation be- 
tween the Trumbo and the AS scales. 


Confidence, Discontent, Extraversion, and 
Neuroticism 


The correlation between the Trumbo and 
the confidence scales suggests that the confi- 
dent look with more favor on innovation than 
those lacking confidence. What held true for 
attitude to change in general, also held for 
attitude to change in promotions policy in 
particular. In fact, the largest of the correla- 
tions in Table 3 is between the AS and con- 
fidence scales, and it is the confident who 
regarded appraisal schemes with at least equa- 
nimity, while the less confident were query- 
ing. The straightforward, linear nature of the 
relation between these two variables is given 
in Table 4, where the sample has been divided 
into three groups of roughly equal size on the 
basis of each of the two scales. Sixty percent 
of those who belonged to that third of the 
sample who were least confident of their abil- 
ity to cope with their jobs held unfavorable 
TABLE 4

ATTITUDE TO APPRAISAL SCHEMES BY
LEVEL OF CONFIDENCE

Appraisal scheme scale group | Confidence scale group: Confident | Average | Unconfident | All
For appraisal | 56 | 21 | 12 | 30 (77)
Average | 33 | 43 | 28 | 34 (88)
Against appraisal | 11 | 36 | 60 | 36 (93)
Total | 100 (88) | 100 (80) | 100 (90) | 100 (258)

Note. χ² = 57, df = 4, p < .001. Ns are in parentheses;
all other numbers indicate percentage.


TABLE 5

ATTITUDE TO APPRAISAL SCHEMES BY
LEVEL OF DISCONTENT

Appraisal scheme scale group | Discontent scale group: Contented | Average | Frustrated | All
For appraisal | 41 | 13 | 39 | 30 (77)
Average | 33 | 33 | 37 | 34 (88)
Against appraisal | 26 | 54 | 24 | 36 (93)
Total | 100 (83) | 100 (96) | 100 (79) | 100 (258)

Note. χ² = 26, df = 4, p < .001. Ns are in parentheses;
all other numbers indicate percentage.


attitudes toward appraisal schemes. These 54 
men were the managers who also showed un- 
willingness to accept change on the Trumbo 
scale, the men for whom change in general 
would seem to be threatening. Their mean 
Trumbo scale score was 27.41 compared with 
32.46 for the rest of the sample (t = 4.18, p
< .01).
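The chi-square test reported for the confidence by appraisal-attitude tabulation (Table 4) can be illustrated with a short sketch. The cell counts below are not given in the article; they are back-calculated here from the rounded column percentages and column Ns (88, 80, and 90), so the resulting statistic is only an approximation and will not necessarily reproduce the published value of 57. The 54 in the bottom-right cell corresponds to the 54 unconfident managers mentioned in the text.

```python
from typing import List, Tuple


def chi_square(table: List[List[float]]) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Reconstructed counts (an assumption): rows are For / Average / Against
# appraisal; columns are Confident / Average / Unconfident.
counts = [
    [49, 17, 11],
    [29, 34, 25],
    [10, 29, 54],
]

stat, df = chi_square(counts)
CRITICAL_001_DF4 = 18.47  # tabled chi-square value for p = .001 with 4 df
print(f"chi-square = {stat:.1f}, df = {df}, exceeds .001 critical value: {stat > CRITICAL_001_DF4}")
```

The same function applies unchanged to the discontent tabulation in Table 5 once its counts are reconstructed in the same way.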

There were no direct, significant correla- 
tions between the AS and Trumbo scales on 
the one hand and the discontent, neuroticism, 
and extraversion tests on the other, but tabu- 
lating the scores on each of the first pair 
against each of the remaining three tests did 
reveal relationships. A clear curvilinear rela- 
tionship between the AS scale and discontent 
is illustrated in Table 5. Over one half of the 
middle group on discontent looked unfavor- 
ably on the features of appraisal schemes, 
while over one third of both the contented 
and frustrated groups looked upon them with 
approval (one third in all categories being 
neutral). No comparable significant relation 
held between the Trumbo and discontent 
scales. 

There were no significant correlations be- 
tween the two attitude scales and either extra- 
version or neuroticism. However, Furneaux 
(1962) has shown that plotting the position 
of subjects with respect to both the extra- 
version and neuroticism scales of 10 uncovers 
relationships that treating these scales inde- 
pendently only conceals. Consequently, the 
sample was divided into four groups on the
basis of the medians of the extraversion and
the neuroticism tests, and the mean AS and
Trumbo scale scores for these groups were
calculated (Table 6). Inspection of the
differences suggested that among the managers
it was the unstable introverts and the stable
extraverts who held conservative attitudes to
change, while the emotional extraverts and
stable introverts tended to support it. A
two-way analysis of variance was calculated for
each set of means and highly significant
interactions (but no other significant effects)
were found (for the AS scale, F = 29.90, df
= 1, p < .01; for the Trumbo scale, F =
33.37, df = 1, p < .01).

Perhaps elucidating the relation between 
the extraversion-neuroticism groups and the 
attitude to change scores was the further rela- 
tion between the groups and their scores on 
the discontent scale. It was the stable intro- 
verts (on average) who had the lowest discon- 
tent scores and the emotional extraverts who 
had the highest. It will be remembered that 
it was the extreme scorers (both the con- 
tented and the frustrated) who approved of 
the basic features of objective appraisal 
schemes. 


Discussion 
Although the area of proposed change 
(quite independent of its actual content) is 
TABLE 6

MEAN APPRAISAL SCHEME (AS) SCALE AND TRUMBO
SCALE SCORES FOR EXTRAVERSION-NEUROTICISM
GROUPS

Neuroticism group | Extraversion below median: AS | Trumbo | Extraversion above median: AS | Trumbo
Above median | 11.67 | 29.33 | 14.29 | 33.86
Below median | 14.01 | 33.41 | 12.14 | 29.54
important in sorting advocates from opponents
of innovation in managerial practice, the idea 
of a generalized attitude to change underlying 
more specific attitudes to innovation is sup- 
ported by this study. The general and the 
more specific tests (the Trumbo and AS scales, 
respectively) were themselves positively cor- 
related and, generally, had similar patterns of 
correlations with other variables. The results 
thus support the findings of others, notably 
Trumbo (1961), Gruenfeld and Foltman 
(1967), and Hardin (1967). 

The relationship between the general and 
the specific tests, however, was not so large 
as to rule out the importance of other
variables. On the contrary, the results clearly
indicate that status, age, confidence in one's
ability to cope, frustration with one's lot,
and introversion and neuroticism in
combination are additional explanatory factors.

Where the two attitude scales did not show 
similar patterns of correlation with the inde- 
pendent variables, the explanation is usually 
obvious. Managerial status correlated sig- 
nificantly with the AS scale, even when age 
was controlled, though not with the Trumbo 
scale. Thus we interpret, as have Bass (1960)
and Cartwright and Zander (1962), that it is
not that the more senior managers were
younger (and therefore more radical) which
explains their more innovative attitude toward
what would have been a major change in
managerial practice in their companies. It is
[illegible] senior the manager ([illegible])
the less he felt threatened [illegible].
The senior managers [illegible]
executants of an appraisal [illegible].

[illegible] managers [illegible] lesser
educated toward work-related change in
general, level of education did not correlate
significantly with attitudes toward features of a
new promotions policy. Trumbo (1961) has
argued that an adequate understanding of the 
phenomenology of industrial change requires 
an analysis of the felt needs of personnel in 
relation to the attitudes they hold toward 
change. Being confronted with a specific pro- 
posal for change, he suggests, is likely to 
sharpen men’s perceptions of what the change 
will mean for them. Thus, the educated may 
share a general ideology with respect to 
change, but not a common fate if change is 
introduced; that is, in evaluating specific pro- 
posals for change, self-interest may then divide 
them. 

The present results suggest that it is man- 
agers who have confidence in their ability to 
do their present jobs who are likely to
welcome novelty in any situation. However, the
importance of this factor needs elaboration in 
further studies; it may be that it is the confi- 
dent who generate the general atmosphere that 
Lippett (1958) has described as conducive to 
change (progress). 

That stable introverts and emotional ex- 
traverts are more likely to hold favorable 
attitudes to change than their counterparts is 
further support for the significance of these 
primary traits. Eysenck (1964) has argued 
their relevance to criminality and (amusingly) 
the present study suggests their connection 
with management practice. 

The shortness of the confidence and discon- 
tent scales resulted from the stringent method 
of item selection used. The persistent
emergence, stability, and homogeneity of the
factors underlying these scales, together with 
the ready interpretability of results involving 
their use, augurs well for longer scales built 
upon these as cores. In addition, the con- 
fidence and discontent factors that emerged 
clearly in each of the preliminary studies re- 
semble closely the job related tension and the 
confidence-in-organization factors used by 
Kahn et al. (1964) in their analysis of the 
stresses that go with different occupational
roles. The duplication and usefulness of these 
factors in independent studies undertaken for 
different purposes argues their general
relevance in the study of management styles and
practices.
THE THERAPEUTIC PERSONALITY IN THE
THERAPEUTIC COMMUNITY 1


LEONARD V. GORDON 2


State University of New York at Albany


This study was designed to identify personality characteristics associated with
the rated ability of various types of mental hospital support personnel to
interact in a therapeutically beneficial manner with patients. Female support
personnel judged by peers to be the more effective were inclined to take their
responsibilities seriously, to be emotionally stable, to be cautious, to have
trust and confidence in other people, to be nonbureaucratic in their work
orientation, and to be aware of the therapeutically preferred procedures for
handling patients. Few significant relationships were found with the
supervisory ratings.


In the past decade the concept of the
therapeutic community as applied to state
hospitals and clinics has received growing
attention and acceptance by professional
mental health personnel (Matarazzo, 1971).
The key principle of this concept is that all
employees, volunteers, and other patients are
potential rehabilitators. Accordingly, along
with the establishment of associated mental
health training programs, the development of
improved selection procedures for hospital
support personnel has become a matter of
increasing importance. Through such
procedures, applicants whose personal
characteristics are such as to render questionable
their value as incidental therapeutic agents
hopefully would be identified.

However, research [illegible] therapeutic
community has been notably lacking. This has
been due to the general disinterest in this
population [illegible] community concept.
The [illegible] represents a modest start toward
remedying this situation. It was designed to
identify personality characteristics associated
with the rated ability of hospital support
personnel to interact with patients in a
therapeutically beneficial manner.


1 This study was initiated and supported by the
New York State Department of [illegible],
which was responsible for [illegible] arrangements and
[illegible] participating installations whose
[illegible] made this study possible.

2 Requests for reprints should be sent to Leonard
V. Gordon, School of Education, State University of
New York at Albany, 1400 Washington Avenue,
Albany, New York 12222.


METHOD 
Procedure 


The original plan of the study involved the testing
of all members of selected units at six state
hospitals with a battery of personality tests and one
cognitive measure. Testing was to be followed by
peer rating sessions and evaluations of all subjects
by two or more unit supervisors. Subjects were to
be paid for participating.

The study deviated substantially from the
proposed design, since, for various reasons, specifications
were not met at the cooperating institutions at the
time of testing. For example, in five of the
institutions the release of entire units for 2 hours was
unfeasible, and the scattered representation from a
number of units resulted in many unusable peer
ratings. In four of the institutions, only single
supervisors were familiar with most participants and in
many instances they knew relatively few of them.
At one installation, a number of functional illiterates
appeared as paid volunteers and went through the
motions of responding. Despite these and related
difficulties, sufficient usable test data and ratings
were obtained from all installations to permit a
meaningful analysis.


Samples


Three state hospitals in the Rocky Mountain area
and three in Pennsylvania participated in the study.
The two western samples (A and B, n = 45 and 36)
and the three eastern samples (D, E, and F, n = 45,
58, and 73) that were ultimately used comprised
experienced employees in a wide spectrum of
support activities such as custodians, various types
of aides, food service personnel, laundry personnel,
etc. The third western sample (C, n = 37) consisted
of members of a psychological-aide training program.
Supervisory personnel who provided the ratings were
primarily registered nurses. While subjects at the
western installations were mainly non-Spanish-American
whites, or Anglos, substantial proportions
of blacks and whites participated at the eastern
hospitals. At all but one of the installations the very
large majority of subjects were females.


Test Variables 


Two temperament tests were included in the
present study, the Gordon Personal Profile (Gordon,
1953) and the Gordon Personal Inventory (Gordon,
1956). These brief forced-choice tests each measure
four personality traits, which may be concisely
described as follows:

Gordon Personal Profile

Ascendency: self-assured, assertive
Responsibility: persevering, determined, reliable
Emotional stability: well-balanced, emotionally stable
Sociability: gregarious, sociable

Gordon Personal Inventory

Cautiousness: cautious, deliberate, not impulsive
Original thinking: mentally active, intellectually curious
Personal relations: tolerant, patient, trustful of others
Vigor: energetic, productive

Also administered were the Work Environment
Preference Schedule (Gordon, 1968, 1970) and an
experimental Mental Health Aptitude Test. The
former measures the individual's commitment to the
set of values, attitudes, and behaviors that are
characteristically fostered and rewarded in highly
bureaucratic organizations. These include willing
subordination to authority, rigid adherence to rules
and regulations, belief in the infallibility of expert
judgment, and strong institutional identification. The
Mental Health Aptitude Test consisted of a set of
problems that might occur with patients, with four
alternate actions that might be taken in each, one of
which was considered to be therapeutically
preferable. Three different Mental Health Aptitude Test
forms of similar content but of unknown parallelism
were used at each installation.

Additionally, information such as age, years of
experience in mental health settings, educational
level, percentage of time the individual preferred to
be around patients, and degree of job satisfaction
was obtained.

Criteria. Subjects were rated by peers and superiors
on their ability to get along with patients and, more
specifically, on their ability to enter into the sort of
[illegible] and comfortable relationships that would be
important for the care and treatment of patients. 3


3 A second rating, of the subjects' ability to work
effectively with patients in a therapeutic sense, is
not considered, being relevant only for the
psychological aides.


One of two types of peer and supervisory ratings
was obtained at each installation. Subjects either
made high and low nominations from participant 
members of their unit (PN) or ranked all such 
members of their unit (PR). Supervisors ranked all 
subjects known to them and then rated each on a 
7-point scale with no inversions permitted (SR) or 
ranked all subjects using a forced distribution (SF). 


Data Analysis 


The first step in the data analysis involved sample 
purification. This consisted, first, of the identification 
and elimination of probable illiterates using the 
multiple criteria of chance scores on the Mental 
Health Aptitude Test, less than 6 years of education, 
and incorrect completion of the forced-choice tests. 
All subjects so identified were from one eastern 
installation (F). Second, subjects describing them- 
selves as blacks or Spanish-American (totaling less 
than one dozen) were eliminated from the three 
western samples to provide more culturally homo- 
geneous Anglo samples. Third, males who comprised 
unusably small samples at all but one installation 
were eliminated. Additionally, black and white sub- 
jects who were significantly represented numerically 
at the three eastern installations were identified for 
separate analysis. 

Next, reliabilities of the peer and supervisory rat- 
ings were estimated. A modified coefficient alpha 
(Gordon, 1969) was used for each peer-rating unit. 
Where two supervisory ratings were available for a 
sample, interrater correlational estimates were ob- 
tained. 

Supervisory ratings at the three eastern installa- 
tions were unusable. In very few instances did a 
subject have more than one rater, and for most 
raters the variances in the 7-point scale ratings were 
very small, with errors of leniency, stringency, or 
central tendency probable. At the three western 
installations, supervisory ratings were acceptable with 
interrater reliabilities (corrected) of .79, .51, and
.86, respectively.

Within each of the three eastern installations, unit
peer ratings varied widely in reliability. Units which
had reliabilities of .24 or below (which occurred for
less than 20% of the groups) or fewer than four
[illegible] were not included in the analysis. The peer
rating reliabilities of the [illegible] samples were
[illegible].

At one western installation (A), unit representation
was unusably small. At the [illegible],
the entire sample rather than unit rosters was
mistakenly employed, yielding unusable peer ratings. At
the third installation (C), which had an intact
sample of trainees, a reliability of .87 was obtained.
Thus, peer ratings were included for only this
western hospital.

Product-moment correlations of the variables
with usable peer and/or supervisory ratings were
computed. Because of the small sample sizes, average
correlations for each set were then obtained by z
transformations. Further, for the eastern samples,
TABLE 1


CHARACTERISTICS OF SUBJECTS BY INSTALLATION AND BY RACE


Characteristic | Installation: A | B | C | D | E | F | Race: White | Black
Mean age | 30.4 | 39.0 | 26.0 | 34.6 | 46.0 | 41.8 | 45.9 | 37.5
Mean experience | 3.2 | 7.6 | [illegible] | [illegible] | [illegible] | 8.7 | 9.1 | 6.2
Mean education | 13.3 | 12.8 | 13.1 | 11.2 | 10.3 | 11.7 | 10.9 | 11.3
Percent black | 00.0 | 00.0 | 00.0 | 62.0 | 19.0 | 79.0 | 00.0 | 100.0


average correlations were similarly obtained for black
and white subjects separately.


RESULTS


Table 1 presents descriptive information
in the form of mean age, experience, and
education for the several samples, as well as
the percentage of blacks at each installation.
Considerable variability across samples on
these characteristics is evident.

Table 2 presents concurrent validities
against peer and/or supervisory ratings at the
six installations, as well as reliabilities of the
ratings. It may be noted that a greater
percentage of test variables have significant
correlations with peer than with supervisory
ratings, the respective values being 42 and
26. These results are mirrored in Table 3,
which presents average validities against the
peer and supervisory rating criteria.
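The averaging of correlations across small samples by z transformation, mentioned under Data Analysis, can be sketched as follows. This is a minimal illustration using Fisher's r-to-z transformation with each sample weighted by n - 3; the article does not say whether the averages were weighted, so the weighting is an assumption. The sample sizes are those given for installations D, E, and F, while the correlations are hypothetical placeholders.

```python
import math
from typing import Sequence


def fisher_z_average(rs: Sequence[float], ns: Sequence[int]) -> float:
    """Average correlations via Fisher's z, weighting each sample by n - 3."""
    if len(rs) != len(ns) or not rs:
        raise ValueError("rs and ns must be non-empty and of equal length")
    if any(n <= 3 for n in ns):
        raise ValueError("each sample needs more than 3 cases")
    weighted_sum = sum((n - 3) * math.atanh(r) for r, n in zip(rs, ns))
    total_weight = sum(n - 3 for n in ns)
    # Transform the mean z back to the correlation metric.
    return math.tanh(weighted_sum / total_weight)


# Sample sizes from the text (D, E, F); correlations are hypothetical.
ns_eastern = [45, 58, 73]
rs_hypothetical = [0.40, 0.10, 0.30]
print(round(fisher_z_average(rs_hypothetical, ns_eastern), 3))
```

Averaging in the z metric instead of averaging the raw coefficients avoids the slight bias that comes from the skewed sampling distribution of r, which matters most when samples are as small as these.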


TABLE 2 


CORRELATIONS OF PERSONALITY AND OTHER VARIABLES WITH PEER OR SUPERVISORY
RATINGS OF EFFECTIVE INTERACTION WITH PATIENTS


Sample 
Variable Western | Eastern 
| = "^ PSS e i Sa EL. 
= q E C e | 5 E F 
Personality E | E z 
Ascendency * = * j 
Responsibility E. -12 sa SUE m^ | ME Oe e 
Emotional "s 21 .35* 52 61 05 04 46 
E is stability me 720 29 “40 A3** | —.06 .33* 
4 crc a | dis | -45 1-15 27 | —04 ‘07 ; 
Original thinking 00 47 40* AT7** 48 AS 26* 
unl relations ES b E [1 E de da a 
igor -— a E t ES m . 
Work Environment hilos 46 07 .05 E 29* | —,30* 25* 
\ chedule 
Other variables —.33* 26 16 239 |—34 | —.22 | —31* * 
Age M Health Aptitude Test 13 o 09 21 
. : 3 E E 36* ;35€6 4 
re m A 3 20 Ai | —.07 —.04 —.09 
ducation — D 35 9 10 .09 ES —.01 
Time with patients ET —.06 .09 —.08 06 08 07 
Mas E: i p Ku 33* 03 dT —.25* 
: ‘ Ase n ‘09 —.04 25* 
Rating reliability = 36 37 37 45 58 73 
Type of rating SF ot -86 87 87 .68 EM 
ae mE SR SF PN PN PN PR 
Note. SF = supervisor forced ranking, SR = supervisor rating, PN = peer high-low nominations, and PR = peer rating.




Subjects who score higher on responsibility 
and cautiousness tend to be rated more highly 
by both peers and supervisors. Additionally, 
those who score higher on emotional stability, 
personal relations, vigor, and the Mental 
Health Aptitude Test and lower on the Work 
Environment Preference Schedule tend to be 
rated more highly by peers. Further, subjects 
expressing greater job satisfaction are in- 
clined to get higher ratings from both sets of 
raters. 

Table 3 also summarizes validity data sep- 
arately for blacks and whites at the same 
three eastern installations. Except for the 
measure of bureaucratic attitudes, which is 
significant for both, validities of the person- 
ality measures are largely significant for the 
blacks but not for the whites. 


DISCUSSION 


On the basis of peer judgments, female 
support personnel who are likely to be the 
more effective in the therapeutic commu- 
nity are inclined to take their responsibilities 
seriously, to be emotionally stable, to be 
cautious, to have trust and confidence in 
other people, and to be energetic. Despite 
being employed in a bureaucratic milieu, they 
tend not to be bureaucratic in their outlook. 
Additionally, such individuals appear to be 
more aware of the preferred procedure for 
handling problems that may arise with pa- 
tients and, incidentally, tend to be the better 
satisfied with their jobs. Neither age, length 
of experience, level of education, nor the per- 
centage of time that the employee prefers to 
work in the presence of patients appears to be 
the generally relevant variable in this regard. 

Supervisors, on the other hand, identify only 
those who tend to have a greater sense of 
responsibility and who are the more cautious 
as the more effective. The smaller proportion
of significant validities with the supervisory
ratings is not ascribable to differences in
criterion reliability, since these are of a com- 
parable magnitude for both types of raters, 
but may have been due to supervisors being 
less familiar than peers with the interpersonal 
attributes of group members. The reliability 
of the supervisory judgments indicates that 
they were making their ratings on some 


TABLE 3


AVERAGE VALIDITIES WITH PEER AND SUPERVISORY
RATING CRITERIA AND FOR BLACK AND WHITE
SUBJECTS AT THE SAME INSTALLATIONS


Ratings | Race
Peer (n = 213) | Supervisor (n = 118) | Black (n = 97) | White (n = 79)
.09 | 06 .03 
230** EL 206 
03 E 
203 —01 
.18* —.04 
10 200 a8 .08 
A E ond .09 Age | —d1 
r 116* 109 5* = 
Work En- p i 3d 
vironment 
Preference 
Schedule —.10% .00 cape | —31* 
Mental = 
Health 
Aptitude 
T 08 28** 
| aT —.17 
| 10 3 
| 10 
02 
23* 


* p < .05.
** p < .01.


common basis. If this were work effectiveness, 
the diversity of jobs within a given sample 
could well account for the low validities ob- 
tained, since the same personality character- 
istics would not necessarily be related to 
effectiveness in different jobs. 

That peers may be the more sensitive evalu- 
ators for the particular criterion employed 
is further suggested by the differences in the 
results obtained with peers and supervisory 
ratings at Installation C, the only place where 
usable ratings were obtained for all subjects 
from both sources and where supervisors and 
trainee subjects were in close and continual 
contact. While the rank ordering of validities 
of the personality characteristics is almost 
perfect (rho = .98), those for the peers are 
uniformly higher in magnitude. Since past 
research (Hollander, 1957; Smith, 1967) has 
shown peer ratings to have the substantially 
superior predictive validity, the present re- 
sults, based on this criterion, would seem to 
merit serious consideration. 
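The rank-order agreement reported above (rho = .98) is a Spearman coefficient computed over the validities obtained from the two rating sources. The sketch below shows the standard no-ties formula; the two lists of validities are hypothetical placeholders, since the Installation C values cannot be read reliably from Table 2, and the function deliberately refuses tied values instead of handling them.

```python
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[int]:
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation using the no-ties formula."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    if len(set(x)) != n or len(set(y)) != n:
        raise ValueError("tied values are not handled in this sketch")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d_squared / (n * (n ** 2 - 1))


# Hypothetical validities for eight traits against each criterion.
peer_validities = [0.12, 0.35, 0.29, 0.15, 0.40, 0.08, 0.21, 0.46]
supervisor_validities = [0.05, 0.21, 0.20, 0.10, 0.25, 0.02, 0.16, 0.30]
print(spearman_rho(peer_validities, supervisor_validities))
```

A high rho with uniformly larger peer coefficients is the pattern the text describes: both sources order the traits the same way, but the peer criterion yields stronger validities.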

The differences in validity for black and 
white employees at the eastern hospitals are 
not ascribable to biased ratings, since blacks 
and whites achieved comparable average peer 
ratings irrespective of the racial composition
of the unit.* Further, means on the personal- 
ity variables were similar for both. A plausible 
case may be made for the obtained racial 
differences in validity actually being due to 
age differences. Not only were the blacks 
appreciably younger than the whites, but over- 
all, the lowest validities within both the east- 
ern and western sets of installations were ob- 
tained for the older samples and the highest 
for the younger. Thus age (and perhaps its 
concomitant, greater experience) may have 
had a significant moderating effect in the 
present study. Since applicant samples are 
likely to be younger and minimally experi- 
enced, the higher validities for the younger 
(and less experienced) employees would sup- 
port the inclusion of dimensions of the present 
type in future selection research. 

Matters not considered in the present
research but meriting serious study (aside from
the neglect of the male sex) would include
the development of techniques for assessing
quasi-literate applicants, since in particular
localities this group comprises a significant
proportion of the labor pool (illiterates on
the whole obtained satisfactory ratings in
the present study); a determination of
whether the validity of variables such as sex,
race, cultural background, and age is
moderated by the composition of the patient
population on these same variables; and an
attempt to obtain evaluations from a highly
relevant third-criterion group, patients.


* Clique ratings, which are revealed by highly
[illegible] coefficients (see Gordon, [illegible])
[illegible] group composition also was noted
[illegible] research (Gordon & Medland, 1965).
A test of auditory selective attention previously validated against criteria of 
flight proficiency was related to the accident rate of professional bus drivers. 
The test requires the listener to monitor a relevant message and ignore a con- 
current message presented to the other ear. A change in selective orientation 
is accompanied by a transient disruption of attention. A measure of proneness 
to this type of disruption was significantly related to accident rate. 


A test of auditory selective attention, 
which has been described elsewhere (Gopher 
& Kahneman, 1971), was validated against a 
criterion of accident frequency in bus drivers. 
The test consists of a series of 48 pairs of 
different messages presented simultaneously 
to the two ears. The items presented to each 
ear are digits and unconnected words, and the 
rate of presentation is two items per second to 
each ear. One of the two messages is desig- 
nated as relevant by a tone, and the task is to 
repeat immediately all digits in that message. 
Part 1 of the message lasts 8 seconds, during 
which either two or four target digits are presented
to the relevant ear. A second tone is 
then presented to indicate which ear is rele- 
vant in Part 2 of the message. On 50% of 
the occasions, the same ear is relevant in both 
parts. Either immediately after the reorienta- 
tion tone or after the interpolation of one or 
two irrelevant items, three pairs of simultane- 
ous digits are successively presented to the 
two ears, and again, the task is to report the 
three digits which have been presented to the 
relevant ear. 

The earlier study (Gopher & Kahneman, 1971) described the
patterns of errors observed in the test and presented evidence of
validity for some flight criteria. Briefly, there are relatively
few errors of omission in the first part of the message and even
fewer intrusions of digits from the irrelevant ear. Part 2 of the
message is much more difficult, however, and the most common type
of error consists of a report of three digits, of which at least
one is drawn from the irrelevant message. In a highly preselected
group of cadets in the Israel Air Force, the number of errors in
Part 2 of the message had a correlation of -.36 with a three-level
criterion of proficiency in pilot training. The validity of
omission errors in Part 1 was significant, but lower. In another
sample of 91 military pilots, the test discriminated significantly
between pilots selected to fly high-performance aircraft and those
assigned to fly slower propeller craft or helicopters. Subsequent
research has confirmed the validity of the test for the prediction
of flight criteria. The predictive variance that it contributes is
essentially independent of other cognitive and psychomotor tests
that are currently in use in the Israel Air Force for the
prediction of pilot aptitude.

1 This study was supported by the Egged Bus Company as part of
their safety program.

The pattern of results obtained in the first 
study led to the interpretation that Part 2 
of the test measures the speed and effective- 
ness with which attention is redirected to a 
relevant channel after an orientation cue. In 
particular, a reorientation from a prior state 
of attention to a channel is more difficult and 
more diagnostic than is the initial adoption of 
an orientation from an uncommitted waiting 
state. 
The ability to reorient attention rapidly to 
relevant stimuli is obviously important to the 
driver on the road, although there seems to 
have been little research on the problem (see 
Häkkinen, 1958). The present study was de- 
signed to investigate the possible contribution 
of the selective attention test to the predic- 
tion of the accident record of professional 
drivers. The study was conducted in the Egged 
Company, a cooperative which operates inter- 
urban bus service throughout Israel and urban 
bus service in the cities of Jerusalem and 
Haifa. The company maintains a record of 
accidents for each driver in which each re- 
ported accident is rated for the severity of 
the driver's error. The total of these ratings 
for the year 1968 was used to select the sam- 
ple. 


METHOD 
Subjects 


Company policy restricts hiring of new drivers to the age range
of from 22 to 32; therefore, the relevant population for the study
was defined as all drivers in the company in that age range. During
the period surveyed, 78 of the 1,087 drivers in this population
(7.2%) had a total accident rating of 3.5 or more, indicating at
least two moderately severe accidents. Of these, 39 were selected
for the study largely on the grounds of accessibility for testing.
For each repeated-accident driver, two other drivers were obtained,
both of whom approximately matched him on the variables of age,
number of years of experience, type of route (urban or interurban),
marital status, and ethnic origin. One of the additional drivers
had a zero-accident rate (obtained by 49% of the drivers in the
company during the period surveyed), and the other had an accident
rating of from
E to 3.0 (obtained by 44% of the drivers). The test 
was introduced as an exploratory study of attention, and the
drivers were told that individual results would not be reported to
the company. Testing was conducted during working hours at the
local bus station as part of the driver's regular schedule of
activities.


TABLE 1

MEANS, STANDARD DEVIATIONS, AND INTERCORRELATIONS AMONG ATTENTION
SCORES, AND CORRELATIONS WITH 3-POINT ACCIDENT CRITERION

Item            M       SD            Accident      Omissions 1   Intrusions 1
                                      criterion
Omissions 1     14.95   12.31         .29
Intrusions 1     6.99    7.11         [illegible]   .49
Errors 2        10.56   [illegible]   .37           .49           .51

Note. [illegible]


TABLE 2

ESTIMATED EFFECTS OF A REJECTION RULE

[Table body not recoverable. Rows: accident-free drivers,
intermediate group, accident-prone drivers, total. The legible
cells include 78 for the accident-prone group and 1,087 for the
total driver population.]



Procedure 

The test has been described in detail by Gopher and Kahneman
(1971). The messages were presented by earphones, and the subjects
were requested to repeat all relevant digits as soon as they heard
them. The duration of the selective attention test, including the
instructions and two practice messages, was 25 minutes. In addition
to the attention test, the subjects completed a brief form of
Raven's (1958) Progressive Matrices test. The experimenter was
unaware of the subjects' accident ratings at the time of testing.


RESULTS 


The study was conducted in the fall of [year illegible].
At its completion, accident records for the
sample in 1969 became available, and they
were used to assess the reliability of the
criterion. The association between the accident
scores for 2 successive years was highly
significant (χ² = 12.75, df = 4, p < .01),
indicating substantial criterion reliability (see
Häkkinen, 1958; Shaw & Sichel, 1971). 

The following scores were derived from the 
selective attention test for each subject: (a) 
the total number of omissions in Part 1 of 
the message, (b) the total number of intru- 
sions from the irrelevant ear in Part 1 of the 
message, and (c) the number of incorrect 
reports in Part 2 of the message (all types of 
errors included). The means and standard
deviations of these scores are presented in
Table 1, which also includes the product-moment
correlations among the variables of
the study. All correlations in Table 1 are
significant beyond the .01 level (N = 117). 

The scores obtained on the selective attention
test were clearly related to the accident
criterion. For practical purposes the most useful
score is the number of errors in Part 2 of 
the test, and little is gained by considering 
the other two error scores. The multiple R 
using all three scores is only .40. A similar 
conclusion was reached in the study of the 
attention test in military flight training 
(Gopher & Kahneman, 1971), although per- 
formance in that sample was far superior to 
that in the present group (e.g., for 100 flight 
cadets, the mean number of errors in Part 2 
was 3.1). Moreover, the validity of the selec- 
tive attention test was not due to differences 
in intelligence; the short intelligence test that 
was administered did not discriminate signifi- 
cantly between the criterion groups, and its 
correlation with the attention test was low 
(.33 with errors in Part 2). 

Some subjects showed an extremely high 
frequency of all three types of errors and also 
failed to improve with practice. These subjects 
typically adopted stereotyped patterns of er- 
rors (e.g., a tendency to ignore the reorienta- 
tion tone between Part 1 and Part 2). Al- 
though these cases have been included in the 
computations reported in Table 1, it appears 
reasonable to doubt the validity of the test 
for individuals who were unable to adapt even 
minimally to its requirements. Accordingly, 13 
subjects who made a total of more than 50 
errors of all kinds in the 24 messages pre- 
sented during the second half of the session 
were removed from the sample. This change 
resulted in an improvement of validity among 
the remaining 104 cases (r = .46 for errors in 
Part 2), as well as a lowering of the intercor- 
relations among the attention scores. When 
the two categories of relatively safe drivers 
were combined and compared to the unsafe 
group, the point-biserial correlation between 
accident frequency and the number of errors 
in Part 2 was .51. 
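
The point-biserial correlation mentioned above can be sketched as follows. This is only an illustration: the function names and the eight error scores are hypothetical and are not the study's data. The sketch uses the standard formula r = (M1 - M0) / s * sqrt(pq), with s the standard deviation of all scores, and checks that it agrees with an ordinary Pearson correlation on a 0/1 variable.

```python
from math import sqrt


def point_biserial(scores, group):
    """Point-biserial r between numeric scores and a 0/1 group label."""
    n = len(scores)
    ones = [s for s, g in zip(scores, group) if g == 1]
    zeros = [s for s, g in zip(scores, group) if g == 0]
    p = len(ones) / n          # proportion in the group coded 1
    q = 1 - p
    mean_all = sum(scores) / n
    # Standard deviation of all scores (divisor n, not n - 1).
    sd_all = sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean_diff = sum(ones) / len(ones) - sum(zeros) / len(zeros)
    return mean_diff / sd_all * sqrt(p * q)


def pearson(x, y):
    """Ordinary product-moment correlation, used here only as a check."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical Part 2 error counts; 1 = unsafe group, 0 = combined safe groups.
errors = [4, 7, 9, 12, 15, 18, 21, 25]
unsafe = [0, 0, 0, 0, 0, 1, 1, 1]
r_pb = point_biserial(errors, unsafe)
```

Because the point-biserial coefficient is a special case of the product-moment correlation, the two functions return the same value for the same data.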
The market conditions in Israel suggest 
that the most useful application of the selec- 
tive attention test would be as an aid to the 
rejection of a relatively small group of appli- 
cants who are most likely to be accident 
prone. An attempt was therefore made to esti- 
mate the effects of a selection cutoff at a score 
of 16 errors in Part 2, and a rule was made 
that repetitive stereotyped errors invalidated 
the test. We counted the number of cases that 
would have been accepted, rejected, or consid- 
ered invalid in each of the three samples in 
the study (accident free, accident prone, and 
intermediate). The effects of the rules on the 
entire driver population were then estimated 
by extrapolation from sample results. The 
estimates shown in Table 2 must be viewed 
with caution, since the present study was post- 
dictive rather than predictive, and the cutoff 
point was chosen to fit the present data. How- 
ever, these estimates suggest that, at a rela- 
tively negligible cost in the rejection of poten- 
tially safe drivers, the use of the selective at- 
tention test as an aid in decisions about hiring 
could lead to a reduction of perhaps 15%-25%
in the number of accident-prone drivers 
accepted. A study of predictive validity will 
now be conducted for at least 1 year. During 
this time, all applicants will be tested, but the 
test results will not be used for hiring deci- 
sions. 
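
The extrapolation from sample results to the driver population can be sketched as below, assuming it amounts to weighting each criterion group's sample rejection rate by that group's size in the population. The rejection counts are hypothetical placeholders, since the body of Table 2 is not legible; the two larger group sizes are rounded from the reported 49% and 44% of 1,087 drivers, so they do not sum exactly to 1,087. Only the figure of 78 accident-prone drivers and the 39 drivers tested per group come from the text.

```python
def extrapolate_rejections(strata):
    """Estimate population rejections from per-group sample results.

    Each group's sample rejection rate is applied to the size of that
    group in the population.
    """
    return {
        name: s["population"] * s["rejected"] / s["tested"]
        for name, s in strata.items()
    }


# 'rejected' values are hypothetical; group sizes other than 78 are
# approximations derived from the reported percentages.
strata = {
    "accident_free": {"population": 533, "tested": 39, "rejected": 1},
    "intermediate": {"population": 478, "tested": 39, "rejected": 3},
    "accident_prone": {"population": 78, "tested": 39, "rejected": 8},
}

estimates = extrapolate_rejections(strata)
share_of_prone_rejected = estimates["accident_prone"] / 78
```

With real counts in place of the placeholders, `share_of_prone_rejected` would correspond to the estimated reduction in accident-prone drivers accepted, and the other two entries to the cost in potentially safe drivers rejected.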
[...] systematically investigated as to [...] difficulty and
discrimination. The data [...] 1,124 introductory psychology
students taking a regularly [...] was critically affected by all
three independent variables with no interactions and that item
discrimination was affected by the use of an inclusive alternative
with a significant interaction between orientation and stem
structure. Some suggestions were made with respect to effective
item-writing strategies.


As a person sits poised to write a typical achievement or
classroom test, he may well be troubled by three points: (a)
Item-writing [...] conspicuously absent [...]; [...] (c) The best
laid test plan [...] by structurally defective items, which can
capriciously inflate or attenuate [...].

The specific [...] interest here is that [...] determining [...]
multiple-choice [...] formats and the item properties of difficulty
and discrimination. Admittedly, test validity and reliability are
[...] a final test product, [...] crucially depends [...] item and
test construction.

[...] is, most often, [...] a stem or lead, followed by one [...]
answers and a number of distracters [...]. Some of the specific
structural characteristics [...] multiple-choice item [...] number
of correct responses, the use of inclusive alternatives, the
positive or negative orientation of the stem, and the order of
items with regard to difficulty. Rules of thumb exist for each of
these structural properties, and for some of them only meager
research is available.*

A four-choice format is most commonly advocated (Adkins, 1947;
Engelhart, 1947) because fewer choices would increase the error
variance and thus decrease test reliability (Mattsen, 1965).
Further, in practice it is quite difficult to construct a larger
number of plausible and effective distracters. In this respect,
Wakefield (1958) reported that 16% of 3,752 four-choice items
functioned as such, whereas only 3% of 3,294 five-choice items
were functional. Adkins (1958), however, suggested that
Wakefield's findings are unreliable, since his criterion for item
effectiveness is questionable. Thus, this issue remains equivocal.

The rules seem to favor a closed-stem format, based on the
premise that the examinee's confusion will be minimized if the
central problem is intact in the stem and based on the notion that
the closed stem will be less ambiguous (Ebel, 1951; Wesman, 1971).
Dunn and Goldstein (1959) specifically investigated this problem,
but were unable to show any systematic effects due to type of
stem.

Requests for reprints should be sent to [...] Psychology [...]
Division [...] Base, Texas.

* It is distressing to note that the most pertinent chapter,
"Writ[...]," in the [...] edition of Educational Measurement is
largely a repeat of the 1951 edition and that it still deplores
the lack of definitive and significant research on problems of
item writing.

Authorities are in disagreement as to the 
effect of inclusive "all or none of the above"
alternatives (Wesman, 1971). Some (Ebel, 
1951; Engelhart, 1947; Hughes & Trimble, 
1965) suggest it to be useful for increasing 
item difficulty when answers are not approxi- 
mations, while others (Adkins, 1947; Person- 
nel Research Laboratory, 1963) suggest it to 
be detrimental to test validity. In general, the 
research on this problem is inconclusive. 

With respect to positive or negative stem 
orientation, most writers seem to agree that 
attention should be drawn to the negative 
character and that negative items are more 
difficult because the respondee is required to 
shift mental set. 

Finally, the order of items within a test 
might have bearing on the difficulty and dis- 
crimination indices of items and thus merits 
control in research. Most authorities (Adkins,
1947; Nunnally, 1967; Personnel Research
Laboratory, 1963) suggest that items
should be ordered on difficulty, beginning
with the easiest, and if distinct subareas are
represented, a spiral-omnibus arrangement is
preferred. 


In this study, three structural characteristics (stem format,
inclusive versus specific distracters, and stem orientation) were
selected for experimental manipulation, while the number of
alternatives, the number of correct answers, and the order of
items were experimentally controlled. The primary dependent
variables were item difficulty (proportion of respondees answering
items correctly) and item discrimination (point-biserial)
coefficients. In addition, the effect of instructional set on test
performance was examined.
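
The two dependent variables can be illustrated with a short sketch. It assumes that discrimination is the point-biserial correlation between each item and the total test score with the item itself included in the total; the text does not say whether a corrected total was used. The function name and the four-examinee response matrix are hypothetical.

```python
from math import sqrt


def item_stats(responses):
    """Return (difficulty, discrimination) for each item.

    responses: one list of 0/1 item scores per examinee.
    difficulty: proportion of examinees answering the item correctly.
    discrimination: point-biserial r of the item with the total score
    (assumption: the total includes the item itself).
    """
    n = len(responses)
    k = len(responses[0])
    totals = [sum(r) for r in responses]
    mean_t = sum(totals) / n
    sd_t = sqrt(sum((t - mean_t) ** 2 for t in totals) / n)
    stats = []
    for j in range(k):
        col = [r[j] for r in responses]
        p = sum(col) / n
        if 0 < p < 1 and sd_t > 0:
            mean_correct = sum(t for t, x in zip(totals, col) if x) / sum(col)
            r_pb = (mean_correct - mean_t) / sd_t * sqrt(p / (1 - p))
        else:
            r_pb = float("nan")  # undefined when everyone passes or fails
        stats.append((p, r_pb))
    return stats


# Hypothetical responses: 4 examinees by 3 items.
responses = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
stats = item_stats(responses)
```

Note that a lower difficulty value means a harder item, since the index is the proportion answering correctly.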


METHOD 


Experimental Design 

The design was a 2 × 2 × 2 × 8 × 2 fixed-effects design, where
the primary experimental variables were defined as follows:


Factor A: a1 items in closed-stem format
          a2 items in open-stem format


Factor B: b1 item stems with positive orientation
          b2 item stems with negative orientation


Factor C: c1 items with four specific alternatives
          c2 items with one of the four alternatives inclusive.


Two additional variables were included. One was 
a partial control factor (D) for potential sequence 
or position effects that might exist among the eight 
formats resulting from Factors A, B, and C. This was 
accomplished by having each item format appear 
only once in a sequence of eight items and once in 
each of the eight possible positions, thus creating 
eight distinct sequences of the eight item formats. 
Eight test forms (each consisting of 64 items) were 
then constructed in which each of the eight distinct 
sequences of the eight item formats were ordered. 
The like-numbered items within the eight test 
forms evaluated the same knowledge element, but 
appeared in a different format on the different test 
forms. 
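
The counterbalancing described above has the properties of an 8 × 8 Latin square. A minimal sketch follows, assuming a simple cyclic square and assuming that each 64-item form is built from eight blocks of eight items, with the starting sequence shifted from form to form. The source does not say how the sequences were actually generated or ordered within a form, so the construction, the function names, and the three-letter format codes are illustrative only.

```python
from itertools import product

# Eight formats from Factors A (Closed/Open), B (Positive/Negative),
# and C (Specific/Inclusive), coded as three letters, e.g. "CPS".
FORMATS = ["".join(f) for f in product("CO", "PN", "SI")]


def cyclic_sequences(formats):
    """Cyclic Latin square: row r is the format list rotated by r places."""
    k = len(formats)
    return [[formats[(start + pos) % k] for pos in range(k)]
            for start in range(k)]


def format_for(form, item, formats=FORMATS):
    """Format of a like-numbered item (0-63) on a given form (0-7).

    Assumption: block b of form f uses sequence (f + b) mod 8.
    """
    k = len(formats)
    block, position = divmod(item, k)
    sequence = (form + block) % k
    return cyclic_sequences(formats)[sequence][position]


sequences = cyclic_sequences(FORMATS)
```

Under this construction every format occurs once in each sequence and once in each position, every like-numbered item appears in all eight formats across the eight forms, and every form contains each format eight times.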

The second additional variable was instructional 
set (Factor E), where the e1 and e2 test forms carried
the instruction to select the "correct response"
and the "most correct response," respectively. Thus,
a total of 16 test instruments was created. Neither 
Factor D nor E, however, was to be included in the 
final analyses with the three primary variables un- 
less demonstrated to be significant with a priori 
analyses on the total test scores (number correct). 


Experimental Test Instruments 


A basic pool of 96 items was developed which cov- 
ered the material presented in an introductory psy- 
chology course. Each item was constructed in the 
closed-stem, positive orientation, four specific alter- 
natives format. Two 50-item pilot instruments were 
developed (with four common items), and each was 
administered to 58 sophomore-level psychology stu- 
dents, the majority of whom had recently completed 
the introductory psychology course. Item analyses 
were performed on the pilot tests, yielding means 
(24.49 and 24.07), standard deviations (3.87 and 
4.71), and item difficulty and discrimination indices. 
Sixty-four of the pilot items were selected for in- 
clusion in the experimental test instruments based on 
two criteria: (a) the evaluation and revision of items 
by the experimenters based on the item analyses and 
(b) course instructor evaluation with regard to 
content. In addition to insuring adequate test reliability
and content validity of the final experimental
test instruments, the pilot tests provided the means
to control item distracter replacement across the item
formats.

The discrimination indices for the three incorrect alternatives
for each of the 64 selected items were ranked from highest positive
to highest negative. When an item format required a substitution of
an original distracter, the one to be replaced was randomly chosen
on the basis of its ranked index, which insured that both
functionally effective and ineffective distracters were replaced in
comparable numbers and reduced the possibility of systematic bias.
Finally, when an inclusive distracter was substituted for an
original response, it became the correct response for 16 of the 64
test items selected at random.

Each test question was developed in the format of each of the
eight possible combinations of Factors A, B, and C, as illustrated
by the example regarding central tendency in Table 1. All items had
four alternatives with one correct response.

4 Ordinarily, one should replace the most ineffective distracters
when constructing nonexperimental test instruments.


TABLE 1
AN EXAMPLE OF THE EIGHT ITEM FORMATS
FOR A SINGLE KNOWLEDGE ELEMENT

Format

a1b1c1 (C,P,S)
Which one of the following terms represents a measure
of central tendency?
a. variance               c. mode
b. standard deviation     d. dispersion

a1b1c2 (C,P,I)
Which one of the following terms represents a measure
of central tendency?
a. variance               c. mode
b. standard deviation     d. none of the above

a1b2c1 (C,N,S)
[stem illegible]
a. [illegible]            c. median
b. [illegible]            d. mode

a1b2c2 (C,N,I)
Which one of the following terms does not represent a
measure of central tendency?
a. [illegible]            c. mode
b. [illegible]            d. none of the above

a2b1c1 (O,P,S)
A measure of central tendency is the
a. variance               c. mode
b. standard deviation     d. dispersion

a2b1c2 (O,P,I)
A measure of central tendency is
a. the variance           c. the mode
b. the standard deviation d. none of the above

a2b2c1 (O,N,S)
[illegible]

a2b2c2 (O,N,I)
[illegible]

Note. C = closed stem; O = open stem; P = positive orientation;
N = negative orientation; S = specific distracter; I = inclusive
distracter.



Subjects 

The entire semester enrollment (1,124 students) of an
introductory psychology course was used as the subject population.
The experimental test instruments served as their final course
exam. This provided a normal test environment, insured subject
motivation, lessened the likelihood of demand characteristics, and
provided a sufficiently large sample size for variance stability.
Subjects were randomly assigned to the 16 test forms. Ninety
minutes were allowed for test completion.


RESULTS 


Since like-numbered items on each test form evaluated the same
knowledge elements but in different formats, the effect of Factor
D was evaluated through a simple one-way analysis of variance
(ANOVA) of the total scores. The test form means and standard
deviations ranged from 33.60 to 35.65 and 6.69 to 8.55,
respectively, based on ns of from 138 to 144. The analysis
indicated that the test forms could be regarded as equivalent
(F < 1.0, df = 7/1,116, p > .25).

The effect of instructional set, also evaluated in terms of total
test scores, was not significant via a t test for independent
groups (t = .77, df = 1,122, p > .05; X̄1 = 34.59, n1 = 564;
X̄2 = 34.23, n2 = 560).
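
The degrees of freedom reported above follow directly from the group sizes, as the sketch below shows. The eight form sizes are hypothetical values chosen only to fall between 138 and 144 and to sum to 1,124. The common standard deviation of 7.6 passed to the t computation is likewise an assumption, since the source reports only the range of form standard deviations (6.69 to 8.55), so the resulting t is a rough plausibility check and not a reproduction of the reported value.

```python
from math import sqrt


def anova_df(group_sizes):
    """Between- and within-groups degrees of freedom for a one-way ANOVA."""
    k = len(group_sizes)
    return k - 1, sum(group_sizes) - k


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-groups t statistic with a pooled variance estimate."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Hypothetical split of the 1,124 students over the eight test forms.
form_sizes = [144, 144, 140, 140, 140, 140, 138, 138]
df_between, df_within = anova_df(form_sizes)

# Means and ns are from the text; the SD of 7.6 is an assumed value.
t_value, t_df = pooled_t(34.59, 7.6, 564, 34.23, 7.6, 560)
```

Any eight group sizes totaling 1,124 give the same degrees of freedom, 7 and 1,116, and the two instruction groups of 564 and 560 give 1,122 for the t test.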


Item Difficulties 


A 2 × 2 × 2 fixed-effects, repeated-measures ANOVA was performed
on the item difficulty values. Open-stem items were more difficult
than closed (F = [illegible], df = 1/63, p < .05), negative items
more difficult than positive (F = 14.71, df = 1/63, p <
[illegible]), and inclusive alternative items more difficult than
all specific items (F = 81.54, df = 1/63, p [illegible]). Mean
difficulty indices for the formats are presented in Table 2. The
absence of significant interactions (all Fs < [illegible]) suggests
that the experimental factors could be considered separately in
item writing. The range of difficulty for each of the 64 knowledge
elements across the eight formats was determined, yielding a mean
range of .31 and minimum and maximum ranges of .13 and .65,
respectively (in the latter case 90% of the subjects were correct
with one format, while only 25% were correct with another format).


EFFECTS OF ITEM FORMAT ON ITEM DISCRIMINATION AND DIFFICULTY


TABLE 2
MEAN DIFFICULTY AND MEAN DISCRIMINATION INDICES

Item                                    Mean          Mean
                                        difficulty    discrimination
Format
  Closed-positive-specific  (a1b1c1)    .61           .28
  Closed-positive-inclusive (a1b1c2)    [illegible]   .26
  Closed-negative-specific  (a1b2c1)    .54           .27
  Closed-negative-inclusive (a1b2c2)    .46           .22
  Open-positive-specific    (a2b1c1)    .61           .27
  Open-positive-inclusive   (a2b1c2)    .54           .24
  Open-negative-specific    (a2b2c1)    [illegible]   .28
  Open-negative-inclusive   (a2b2c2)    [illegible]   .25

Main effect
  Closed stem (a1)                      .54           .26
  Open stem (a2)                        .53           .26
  Positive orientation (b1)             .58           .26
  Negative orientation (b2)             .50           .26
  Specific alternatives (c1)            .58           [illegible]
  Inclusive alternative (c2)            .50           .24

A X B interaction
  Closed-positive                                     [illegible]
  Closed-negative                                     [illegible]
  Open-positive                                       .26
  Open-negative                                       .26

Note. Decimals are rounded to two places.


Item Discriminations 


An identical ANOVA performed on the item discrimination values
indicated that only the manipulation of inclusive versus specific
alternatives systematically affected discrimination values
(F = 17.19, df = 1/63, p < .001). Inclusive alternatives
significantly decreased the discriminative ability of an item.
While neither the stem format nor the item orientation produced a
significant main effect (both Fs < 1.0, p > .10), these variables
did interact (F = 4.43, df = 1/63, p < .05) to affect item
discriminations differentially, with closed-positive and
open-negative items being slightly more discriminating than
closed-negative and open-positive. The interaction means are also
contained in Table 2. No other interactions were significant
(all ps > .10).


No external criterion was available with 
which to gauge the overall validity of the 
tests. However, an examination of the table 
of point biserials for all knowledge elements 
in all formats (512 element-format combina- 
tions) is somewhat helpful regarding the 
goodness of the tests. Three hundred and 
eleven point biserials reached or exceeded an 
arbitrarily chosen value of .30. Twenty knowledge
elements were discriminating (i.e., r ≥ .30)
in five or more formats, eight in [illegible]
of the formats, and three in all eight formats. 


DISCUSSION AND CONCLUSIONS 


The purpose of this research was to collect 
empirical data regarding selected rules of 
thumb for multiple-choice item constructions 
that have been blindly perpetuated. In general,
the results were supportive of the rules [...]
ramifications as to the relations between item
difficulties and item discriminations.

With respect to item difficulties, the results
of this study were in accord with the most
popularly stated rules. Open-stem items were
more difficult than closed, negative items more
difficult than positive, and items with inclusive
options were more difficult than those 
with all specific options. An interesting re- 
sult was that none of the format properties 
interacted. Therefore, one can reasonably con- 
clude that an increase or decrease in item 
difficulty produced by altering one format 
property is unlikely to be nullified due to its 

unique combination with another property. 
The results of this research do not explain 
why one format property yields more difficult 
items than another. However, some practices, 
which once were supported by intuition alone, 
now have empirical and experimental evidence 
favoring them. On the other hand, at least 
several often-cited explanations seem to have 
been negated. For example, the open-stem 
item is often said to be more difficult than a 
comparable grammatically complete item stem 
because comprehension of the central prob- 
lem is more assured in the latter. However, 
exceptional care was taken to insure equal 
descriptive information and detail in both 
stem types. Therefore, the intuitive explanation appears
inadequate for the results of this study. Negative items were more
difficult than positive. The most frequent explanation is that the
negative item is a departure from expectation (i.e., most items
within a given test are typically positive) and requires a shift
of mental set, which test takers fail to do. In the present study
no set for positive or negative orientation should have developed,
since each test form contained an equal number of positive and
negative items. Therefore, the shifting of mental set explanation
seems to have been effectively eliminated for the present study.
An alternative explanation [...] study [...] is that humans are
conditioned to, and practiced in, a positive orientation toward
problem solving, and thus the shift that would be necessitated
[...] from a positive set outside the testing environment to a
sometimes negative set within the testing situation.

There would appear [...] an item with an inclusive alternative
requires more from the respondee [...] specific alternatives. Not
only must the respondee know what is correct [...] same time he
[...]. In other words, [...] knowledge is being tapped, which would
appear to lessen the value of simple recognition or incidental
knowledge.

These results clearly suggest that the test 
writer should not be nonchalant in achieve- 
ment item construction. The format of the 
items had pronounced effects on their diffi- 
culties and thus could have dire consequences 
on a test used for grading purposes without 
normative data. This would be especially true 
for the classroom teacher who uses various 
item formats and who assigns grades on a 
percentage correct basis. The possibility of in- 
creasing or decreasing the proportion correct 
by as much as the .65 reported above seems 
incredible, especially in light of the fact that
the change would be artifactual and unrelated 
to mastery of the specific knowledge element. 

The discriminating ability of an item was 
less susceptible to format changes. The use of 
an inclusive alternative was deleterious to
discriminations, and the combinations of open-positive
and closed-negative stems reduced item
discriminations relative to their counterparts. 
No direct evidence was obtained which would 
explain why an item with an inclusive alterna- 
tive was less discriminating. 

The lack of differential test performance as 
a function of printed instructions, that is, 
choose the most correct versus the correct 
response, deserves a passing comment. The
experimental manipulation was most likely 
rather weak, in that no verbal attention could 
be called to the instructions. Thus, the present
data cannot necessarily be interpreted as a
denial of a difference between the
two instructions. 

It seems clear that the structural format of 
an item is indeed capable of capriciously al- 
tering either item difficulty, item discrimina- 
tion, or both, and thus most likely has some 
unknown effect on test validity and reliability. 
Two tentative conclusions about item-writing 
tactics for tests of the present kind seem 
appropriate. First, if items are developed that 
possess undesirable difficulties, the writer may 
safely alter either the item stem or its orienta- 
tion (but not both) without adversely affect- 
ing its discriminatory power. Second, the use 
of inclusive alternatives should be avoided. 




A final word about the experimental design. 
No suitable control was available to guard 
against the possibility that the reworking of 
the base format items into other formats 
might subtly alter the knowledge element be- 
ing questioned. Factor analytic procedures 
were employed as a partial check on the di- 
mensionality of the items, which suggest, not 
without some question, that the factor struc- 
tures of the knowledge elements did not 
change from format to format. Nonetheless, 
future research should encompass a designed 
control for this possibility. 
SHORT NOTES 


EFFECTS OF VARYING PERFORMANCE-PAY INSTRUMENTALITIES 
ON THE RELATIONSHIP BETWEEN PERFORMANCE AND 
SATISFACTION: 


A TEST OF THE LAWLER AND PORTER MODEL 


ROBERT D. PRITCHARD : 


Purdue U niversity 


The Lawler and Porter model of job motivation predicts that a high-instru-
mentality pay system should result in positive relationships between perfor-
mance and satisfaction with pay, and that a low-instrumentality system

correlations. The p
and measures of p
taken. The results indicated that perfor


job satisfaction-job performance literature (Brayfield & Crockett, 1955;
Vroom, 1964) have indicated that there is no
simple, direct relationship between these two
approach to this problem is
the Lawler and Porter (1967) model
and its subsequent refinement (Porter & Lawler,
at where there is a
high instrumentality)
between performance
and satisfaction.
rationale for this predic-
a reward system, high perfor-
t, in turn, result
performance results i
at, in turn, result in low satisfac-
a situation, a positive correla-
instrumentality) differences in performance are
not followed by corresponding differences in re-
wards and satisfaction. Thus, no relationship
rmance and satisfac-
tion.

There is some evidence to support these pre-
dictions with managers (Porter & Lawler, 1968),
hospital workers (Schneider & Olson, 1970), and
college students (Cherrington, Reitz, & Scott,
1971). The research reported here attempts to
elaborate on these findings in that it tests the
predictions from the model by directly manipu-
lating pay instrumentality in a
work simulation design that closely approximates
actual hourly and piece-rate work on a realistic
task. Furthermore, two simulations are reported,
one of fairly long duration (3 days) and another
of short duration (1½ hours).


STUDY I
Method


The data from this study constitute a reanaly-
sis of data reported by Pritchard, Dunnette, and
Jorgenson (1972). The relevant data came from
106 male college students who were recruited by
advertisement and were hired for what they
thought was a real job. These subjects w
on either an hourly
Subjects in each condition worked 4 hours
per day for 3 consecutive days.

A clerical task was used which el
ation in quality of performance
it was
impossible to
one unit of the task until it
was done correctly.
complete description
of the task
(1972) and in Pritchard et al.

Measures of performance consisted of the ac-
tual number of task units completed. Job satis-
faction measures were taken at the end of each
day. Two measures were used, the Minnesota
Satisfaction Questionnaire (MSQ;
England, & Lofquist,
Job Description Index (JDI;

* Requests for reprints should be sent to Robert
D. Pritchard, Department of Psychology, Purdue
University, Lafayette, Indiana 47907.


Results


To test the predictions, correlations for each
condition were computed between performance
and satisfaction with pay for each of the 3 days.
For the MSQ pay item under the hourly condi-
tion, correlations were -.15, .21, and -.06 for
Days 1, 2, and 3, respectively. Corresponding
correlations with the JDI were .08, .33, and .36.
Under the incentive pay condition, the correla-
tions for the MSQ were .21, .29, and .28, while
the JDI correlations were .36, .29, and .29.² The
results offer some support to the hypothesis for
the MSQ item related to pay. The median cor-
relation for the 3 days under the incentive pay
system is .28, while the median is -.06 under the
hourly pay condition. Positive correlations were
found for the JDI pay scale under the incentive
pay condition (Mdn = .29) and they were also
found under the hourly pay system (Mdn = .33).
Thus, the MSQ offers some support for the hy-
pothesis, while the JDI does not.
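
One way to probe the apparent MSQ contrast between the two pay conditions is to test the difference between two independent correlations with Fisher's r-to-z transformation. The sketch below is illustrative only and is not an analysis reported in this note. The function name is hypothetical; the inputs are the Day 3 MSQ correlations given above together with the Day 3 sample sizes from the accompanying footnote (58 hourly, 43 incentive).

```python
import math
from statistics import NormalDist


def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test for two correlations from independent samples.

    Returns the z statistic and its two-tailed p value under the usual
    large-sample normal approximation.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher transformation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # standard error of z1 - z2
    z = (z1 - z2) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Day 3 MSQ pay item: incentive condition vs. hourly condition.
z, p = compare_independent_correlations(0.28, 43, -0.06, 58)
print(f"z = {z:.2f}, two-tailed p = {p:.3f}")
```

Under this approximation, an absolute z of about 1.96 or more would indicate a difference significant at the .05 level (two-tailed).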


STUDY II
Method 


In the second study subjects were also hired 
for what they thought was a real job, which was 
known to be of approximately 2 hours duration. 
The subjects consisted of male and female high
school and college age subjects (N = 60) who
answered advertisements in the local newspaper
and the campus paper. Subjects were placed into
one of four separate conditions, two methods of
payment by two amounts of pay. The two meth-
ods of payment consisted of hourly pay and
piece-rate payment. Within each of these condi-
tions subjects were placed in either a high-pay
or a low-pay condition. For subjects in the hourly
pay condition, this consisted of $1.75 per hour or
$2.50 per hour. The piece rate (7¢ for the low
pay and 10¢ for the high pay) was set so that if
a subject in the low-pay-piece-rate condition
performed at the same level as the average low-
pay-hourly subject, he would earn $1.75 per hour.
Similarly, the high-pay piece rate was set so that
performing at the level of an average hourly-pay-
high-pay subject would result in $2.50 per hour
for the high-pay-piece-rate subject.

The task was identical to that used in Study I. 


² ns under the hourly pay condition were 58, 56,
and 58 for Days 1, 2, and 3, respectively. Corre-
sponding ns for the incentive condition were 48, 45,
and 43. Correlations of .29 or greater are significant
at the .08 level, while the correlation of .36 is signifi-
cant at the .01 level.


Results 


For the MSQ pay item, correlations under the
hourly condition between satisfaction and per-
formance were -.20 for the low-pay group (n =
15), -.25 for the high-pay group (n = 15), and
-.15 for both groups combined (n = 30). For the
JDI, respective correlations were -.15, -.27, and
-.14. The sample sizes were identical. Under the
piece-rate condition, MSQ correlations were -.45
for the low-pay group (n = 15), .01 for the high-
pay group (n = 15), and -.28 for both groups
combined (n = 30). Respective correlations for
the JDI were -.35, .01, and -.20. Sample sizes
were again identical.

These results do not support the hypothesis.
While none of the correlations are significant (p
< .05), they are generally negative. Furthermore,
when the results from all subjects in the hourly
pay condition (Mdn r = -.143) are compared
with all subjects in the piece-rate condition (Mdn
r = -.24), the direction of the difference is op-
posite that predicted.


DISCUSSION 


When taken together, the results from the two 
studies do not support the hypothesis that per- 
formance and satisfaction will be positively re- 
lated in a high-instrumentality reward system and 
unrelated in a low-instrumentality reward sys- 
tem. Before we can accept this conclusion, how- 
ever, several issues must be considered. 

One might argue that the hourly pay system in 
the first study did not allow for a true test of 
the hypothesis, in that all subjects in that condi- 
tion received the same level of reward (pay), and 
according to the model, this lack of variability 
in pay should have resulted in lack of variability 
in satisfaction with pay. This restriction in 
range would result in the predicted low correla- 
tion between performance and pay. However, the 
second study addressed itself to exactly this ques- 
tion. Since two levels of pay were used, variabil- 
ity did exist when the analysis was made with 
subjects from both pay levels combined. How- 
ever, the correlations were still essentially zero. 
Thus, this restriction in range problem could be 
ruled out. 

A second issue deals with the perceived level of
equitable rewards component mentioned in the
model. This variable is said to affect the transla-
tion of level of rewards to satisfaction with re-
wards. Thus, if the perceived level of equitable
pay varied in the subjects, the results become
difficult to interpret. To test for this possibility,
subjects in the second study were asked how
fair they felt their pay was. It was assumed that
subjects who reported the pay was too low had a 
higher level of perceived equitable rewards than 
subjects who reported that they felt the pay was 
fair. Responses to this item were partialed out of 
the relationship between performance and satis- 
faction. This analysis led to exactly the same con- 
clusions as the original analysis. 

Two specific problems with the results should 
be mentioned. The first is that the second study 
was of very short duration, 1½ hours. It may be
that in so short a time subjects do not acquire a 
sufficient exposure to the job to meaningfully 
report feelings of satisfaction with pay. However, 
the first study was of longer duration (3 days, 4 
hours per day) and the hypothesis was still not 
clearly supported. Second, the Cherrington et al. 
(1971) study was also of short duration (2 hours) 
and their data supported the predictions. 

A second potential problem is the use of scales 
such as the JDI and MSQ for measures of satis- 
faction in this sort of research. The MSQ measure 
was only one item, and the JDI was primarily 
developed for blue-collar workers in full-time 
jobs. It is possible that more refined measures of 
pay satisfaction which deal with satisfaction on
part-time, temporary work would be more ap-
propriate. For example, it is difficult to assess
how subjects would respond to JDI pay items
such as "barely live on income" and "in-
come provides luxuries" for such a job.

One curious finding was that under the piece-
rate condition in Study II, data from the low-
pay subjects showed negative correlations (Mdn
= -.40) between performance and pay satisfac-
tion, while data from the high-pay subjects
(Mdn = .01). While none
in the low-pay condition had
money, and since they were work-
not earning large amounts of money
were dissatisfied. Similar subjects in the
high-pay group may have been more satisfied due
earnings. However, this is pure
speculation at this point.

The model deals wit
the instrumentality
types of pay system, a
mentality manipulation
that subjects perceived
"chances in
would result in making more money" (F = 30.45,
df = 1/36, p < .0001). Clearly, a differential re- 
lationship between performance and pay was es- 
tablished. 

However, the relationship between rewards and
satisfaction was not clearly evident, especially for 
the piece-rate subjects. Level of pay was simply 
not positively related to level of satisfaction with 
pay. Clearly, the translation of level of reward, 
or at least level of pay, into satisfaction is more 
complex than the model implies. 

The question still remains, however, of why 
these results do not support the model, while the 
research cited previously does tend to support it. 
It is interesting to note that the strength of the 
relationships reported in this literature is quite 
variable. For example, Schneider and Olson 
(1970) found correlations of .24 and -.03 for
their high- and low-performance-pay-instrumen- 
tality groups, respectively, while Cherrington et 
al. (1971) reported analogous correlations of .67 
and .03. Admittedly, the Cherrington et al. study 
was conducted in the laboratory while the 
Schneider and Olson study was a field survey, 
yet differences this large are surprising. Further- 
more, in the two studies presented in this paper, 
there was a great deal of variability in the find- 
ings, even with the same task. Correlations be- 
tween performance and satisfaction were largely 
positive in the first study, but predominantly 
negative in the second. 

This variability in findings would imply that
the relationships in question are more complex
than the model implies. Specifically, the trans-
lation of level of reward into satisfaction with
that reward is probably influenced by many other
variables. This conclusion is not particularly sur-
prising and has been noted in some detail in
recent work by Lawler (1971). Dealing exclu-
sively with the outcome of pay (the job outcome
dealt with in the present study), he discusses how
such variables as perceived job characteristics,
perceived nonmonetary outcomes, and wage his-
tory can influence satisfaction with pay, in addi-
tion to actual level of pay.


The results of the present study clearly imply 
that the simple conceptualization of satisfaction 
as being a function of level of pay is inadequate 
and that future research must consider such de- 


terminants of satisfaction as those proposed by 
Lawler.


EGO-DEFENSIVENESS AS A DETERMINANT OF REPORTED
DIFFERENCES IN SOURCES OF JOB SATISFACTION 
AND JOB DISSATISFACTION 


TOBY D. WALL 1 
University of Sheffield, England


Seventy-seven male employees described the determinants of their present
satisfactions and dissatisfactions during three work periods and completed a
Marlowe-Crowne Social-Desirability Scale. It was found, especially in relation
to the determinants of job dissatisfaction, that the higher the indi-
vidual's social-desirability score the greater was his tendency to respond in a
manner predictable from Herzberg's two-factor theory. These results support
Vroom's suggestion that Herzberg's results are in part a product of ego-de-
fensive processes within individuals.


The central hypothesis of the two-factor theory
(Herzberg, Mausner, & Snyderman, 1959) is
that the perceived determinants of job satisfac-
tion are qualitatively different from the perceived
determinants of job dissatisfaction. A distinction
is drawn between motivators and hygiene factors,
motivators being job-content variables (such as
achievement or responsibility) and hygiene fac-
tors being job-context variables (such as company
policy and administration and work conditions).
The theory proposes that motivators are more
important than hygiene factors as determinants of
job satisfaction, and hygiene factors are more
important than motivators as determinants of
job dissatisfaction.

The two-factor theory has been subjected to a 
large amount of criticism, much of which has 
focused on its methodology (see, e.g., Soliman,
1970). Of special importance, because of its rele-
vance to job attitude research in general, is the
suggestion that Herzberg's original results are a 
function of interviewees responding in an ego- 
defensive manner. Vroom (1964) put this argu- 
ment succinctly: 


It is . . . possible that obtained differences between 
stated sources of satisfaction and dissatisfaction stem 
from defensive processes within the individual re- 
spondent. Persons may be more likely to attribute 
the causes of satisfaction to their own achievements 
and accomplishments on the job. On the other hand,
they may be more likely to attribute their dissatis-
faction not to personal inadequacies or deficiencies, 
but to factors in the work environment [p. 129]. 


This study investigates Vroom’s hypothesis 
that Herzberg’s results may be attributed to 
defensive processes within individuals. It is pre- 
dicted (a) that individuals high on ego-defensive- 
ness are more likely than individuals low on ego- 
defensiveness to attribute their satisfying experi- 
ences at work to motivators rather than hygiene 
factors and (b) that individuals high on ego-
defensiveness are more likely than individuals
low on ego-defensiveness to attribute their dis-
satisfying experiences at work to hygiene factors
rather than to motivators.


METHOD 


Seventy-seven white-collar and blue-collar em- 
ployees, representing a 30% randomly selected 
sample of the male work force of a chemical 
company, were individually interviewed. Each 
subject described the d

ment situation. Every interviewee also com-
pleted a Marlowe-Crowne Social-Desirability
Scale (Crowne & Marlowe, 1964). This scale is
designed to measure the extent to which an indi-
vidual avoids attributing sociall
titudes or behavi
regarded as
siveness. The
social-desirability
administered at the
read to ensure that the inter-
inadvertently gui
subj

the author and a research assistant
factors elicited (94%)
hygiene (96%)
those
cases where there was disagreement the research
assistant's decision was final.


To enable statistical tests of the hypothesis to
be made, three measures, which we shall call
Herzberg scores (see Wall et al., 1971), were
developed. The Satisfaction
Herzberg Score (SH score) represents the extent
to which an individual, in relation to satisfac-
tion alone, responded in the manner predicted by
the two-factor theory; that is, the extent to
which motivators rather than hygiene factors
were seen as the determinants of his satisfaction.
For example, Subject X gave three factors as
reasons for his satisfaction, two of which were
motivators and one a hygiene. The two-factor
theory being supported in two out of three in-
stances, his SH score was .67. The Dissatisfaction
Herzberg Score (DH score) represents the degree
to which an individual, in relation to dissatisfac-
tion alone, responded in the manner predicted by
the two-factor theory; in other words, it is ob-
tained by dividing the number of hygiene fac-
tors perceived as determinants of dissatisfaction
by the total number of factors mentioned. The
Overall Herzberg Score (OH score) is a measure
of the degree to which, in relation to satisfaction
and dissatisfaction combined, the individual re-
sponded in a manner compatible with the predic-
tions of the two-factor theory. The three Herz-
berg scores were computed separately for each
of the three work periods considered.
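
The scoring rule described above can be sketched in a few lines. This is a minimal illustration, not the author's scoring procedure: the function name and the "motivator"/"hygiene" labels are hypothetical, and the OH formula (pooling the theory-consistent responses from both sets of factors) is an assumption, since the text defines the OH score only verbally.

```python
def herzberg_scores(satisfaction_factors, dissatisfaction_factors):
    """Return (SH, DH, OH) for one respondent and one work period.

    Each argument is a list of labels, "motivator" or "hygiene".
    A score is None when the respondent mentioned no relevant factors.
    """
    def proportion(hits, total):
        return hits / total if total else None

    # Theory-consistent responses: motivators behind satisfaction,
    # hygiene factors behind dissatisfaction.
    sat_hits = sum(1 for f in satisfaction_factors if f == "motivator")
    dis_hits = sum(1 for f in dissatisfaction_factors if f == "hygiene")
    n_sat = len(satisfaction_factors)
    n_dis = len(dissatisfaction_factors)

    sh = proportion(sat_hits, n_sat)
    dh = proportion(dis_hits, n_dis)
    # Assumed pooling rule for the overall score.
    oh = proportion(sat_hits + dis_hits, n_sat + n_dis)
    return sh, dh, oh


# Subject X from the text: two motivators and one hygiene factor as
# reasons for satisfaction. The dissatisfaction list is invented here.
sh, dh, oh = herzberg_scores(
    ["motivator", "motivator", "hygiene"],
    ["hygiene", "motivator"],
)
print(round(sh, 2), dh, oh)
```

Returning None for an empty list mirrors the fact that not every respondent can be scored on every measure in every period.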
The product-moment correlation coefficients
for the relationships between the Herzberg
Scores and the social-desirability scores, and the
means and standard deviations of the Herzberg
Scores for each of the three job periods consid-
ered are presented in Table 1. The social-desira-
bility scores had a mean of 17.49 and a standard
deviation of 5.98. A clear pattern is evident. At
no job period do the SH scores demonstrate the
predicted relationship with the social-desirability
scores. In contrast, the DH scores show the
predicted relationship in all three instances, and
the OH scores confirm the hypothesis at two of
the three job periods. In this study, however,
social desirability was found to be statistically
significantly related to social class and age. Blue-
collar employees obtained higher social-desirabil-
ity scores than white-collar employees (F = 14.9,
p < .001), and older respondents obtained higher
scores on the social-desirability scale than their
younger counterparts (r = .32, p < .01). Further
analysis of the data, using
techniques to hold social class and age constant,
yields the same significant results (Table 2).

One other finding is of interest. The Herzberg
scores show no statistically significant dif-
ferences (except the OH scores between present
job and job during last year, where p < .05 on a
correlated t test) between different job periods.
This negates the suggestion made by Hinrichs
and Mischkind (1967) that longer recall periods
promote results more compatible with the two-
factor theory.


TABLE 1

PRODUCT-MOMENT CORRELATION COEFFICIENTS FOR THE RELATIONSHIP BETWEEN HERZBERG SCORES AND
SOCIAL-DESIRABILITY SCORES AT THREE PERIODS AND THE MEANS AND STANDARD
DEVIATIONS OF THE HERZBERG SCORES

                                        Job
                 Present            During last year           Previous
Statistic     SH     DH      OH     SH     DH      OH      SH     DH     OH
r            -.08    .37            .00    .37     .28     .20    .55
n             76     74      76     62     73      74      64
p             ns   <.0025    ns     ns   <.0025   <.01     ns
Mean                 .79     .67    .71    .78     .75     .73
SD           .32     .35     .23    .39    .35     .21     .40    .35    .29

Note. SH = Satisfaction Herzberg Score, DH = Dissatisfaction Herzberg Score,
and OH = Overall Herzberg Score. Some subjects did not describe both satisfying
and dissatisfying experiences in each of the three periods; consequently,
the numbers vary.


DISCUSSION


The hypothesis of the present study, derived
from Vroom's suggestion that Herzberg's origi-
nal results are an artifact of ego-defensive pro-
cesses within individual respondents, was sup-
ported in its entirety, by means of the OH scores,
at two of the three job periods. These findings
are compatible with those of Wall et al. (1971)
and Schneider and Locke (1971). However, when
satisfaction and dissatisfaction at work are exam-
ined separately, an interesting fact becomes ap-
parent: The higher the individual's social-desira-
bility score, the greater is his tendency to at-
tribute his dissatisfaction to hygiene factors
rather than to motivators; but individuals with
higher social-desirability scores do not show a
stronger tendency, compared to individuals with
lower social-desirability scores, to attribute their
satisfaction to motivator rather than hygiene
factors. This result holds for all three job pe-
riods considered.

How might one account for this divided sup- 
port for the hypothesis of this study? In rela- 
tion to job dissatisfaction, Vroom's interpreta- 
tion is plausible—when describing dissatisfying 
experiences at work, the individual with high ego- 
defensive tendencies will avoid attributing these 
experiences to motivators (such as lack of ad- 
vancement or lack of achievement) and so avoids 
possible implications of personal failure. With 
respect to satisfaction, however, it is less clear as
to what constitutes an ego-defensive response.
Indeed, in this situation, the need for ego-defense
is difficult to discern and, consequently, may not
manifest itself in the individual's response. He
may feel that whether he attributes his satisfy-
ing experience at work to motivators or to hy-
giene factors, it will imply no threat to his self-
concept.

The interpretation of the lack of relationship
between the perceived determinants of job satis-
faction and social-desirability remains a moot
point. Nevertheless, the positive relationship be-
tween social-desirability and individuals' tenden-
cies to attribute their dissatisfaction at work to
hygiene factors rather than to motivators can
only cast doubt upon the validity of the two-
factor theory.


TABLE 2

REGRESSION ANALYSIS: THE RELATIONSHIP OF HERZBERG SCORES WITH
SOCIAL-DESIRABILITY SCORES, SOCIAL CLASS, AND AGE PARTIALED OUT

                                        Job
                 Present            During last year           Previous
Statistic     SH     DH      OH     SH     DH      OH      SH     DH     OH
t            -.21   2.51     .87    .39   2.23    1.89     .96   4.34   3.66
n             76     74      76     62             74
p             ns    <.01     ns     ns   <.025    <.05

Note. Subjects were classi
use of dummy
SH = Satisfaction Herzberg Score, DH = Dissatisfaction Herzberg Score,


ance measurement and
four independent variables
accounted
dependent variables.
Implica-

are systematically related
to performance. Though the utilization of pay as
receiving increasing attention
regard (Lawler, 1971)

Haire, Ghiselli, and Gordon (1967) by simply
examining the intercorrelations among annual
pay over a number of years. Since neither study 
utilized independent measures of performance, 
however, almost any interpretation of the corre- 
lations is possible, depending upon the assump- 
tions one chooses to make about the stability of 
performance and the consistency of its reward 
(Opsahl & Dunnette, 1966). 

Lawler (1966) obtained low correlations be-
tween performance and salary levels for mana-
gers in four private and three public organiza- 
tions. These correlations, however, indicate very 
little about the nature of performance-pay rela- 
tionships. Rewarding performance with pay pri- 
marily requires that pay changes be linked to 
performance during a given time period. Low per- 
formance-pay-level correlations could still occur 
because of the possible differential effects of non- 
performance forces on pay levels. In addition, 
the more unstable performance is from reward 
period to reward period, the greater the changes 
in managers’ relative pay that are necessary to 
maintain a high correlation between performance 
and pay levels. 

The evidence to date thus indicates very little 
about the nature of managerial performance-pay 
relationships. The present study is designed to 
investigate these relationships more thoroughly. 
Specifically, it examines the impact of perfor-
mance on managerial pay levels and changes,
controlling for job level, length of service, and
education. For previously mentioned reasons, it
was expected that performance would have a
stronger relationship to pay changes than pay
levels.


METHOD


Samples

The samples were comprised of male depart-
ment managers of a large department store for
whom all necessary data were available (approxi-
mately 70% of the population) in the appropri-
ate time periods. The time periods and sample
sizes were 1970 (n = 63), 1971 (n = 68), and 1970
and 1971 (n = 51).


Measures

Pay. Pay was measured by summing the annual
and bonus (if any) granted
immediately following his annual
ion. Pay change was mea-
percentage change in salary plus

Performance.
formance evaluation score w
measure of performance. This score is determined
through a joint manager-supervisor evaluation 
of the manager's annual performance relative to 
predetermined performance objectives. The date 
of performance refers to the date of measure- 
ment for the previous year's work. 

Job level. Based upon differences in depart- 
mental sales volume and administrative difficulty, 
the organization had classified the managers into 
one of four job levels for 1970 and into one of 
five job levels for 1971. A definite salary range 
was established for each level, though there was 
considerable overlap between them. 

Length of service. Length of service was mea- 
sured by the number of months the manager had 
been employed as a department manager in the 
organization. 

Education. Education was measured by the 
number of years of formal education beyond the 
eighth grade.


Analysis 


Partial correlation analysis was conducted to 
assess the relative importance of the independent 
variables in explaining variance in pay levels and 
changes. Multiple correlations were computed to 
assess the combined impact of the independent 
variables on pay levels and changes. 
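
A partial correlation of any order can be built up recursively from zero-order correlations, removing one control variable at a time. The sketch below shows that recursion in plain Python. It illustrates the general technique rather than the authors' actual computation, and the function name, variable names, and correlation values in the example are made up for demonstration.

```python
import math


def partial_corr(r, x, y, controls):
    """Partial correlation of x and y with `controls` held constant.

    `r` maps frozenset({a, b}) to the zero-order correlation of a and b.
    The standard recursion removes the first control variable after the
    remaining controls have been partialed out.
    """
    if not controls:
        return r[frozenset((x, y))]
    z, rest = controls[0], controls[1:]
    r_xy = partial_corr(r, x, y, rest)
    r_xz = partial_corr(r, x, z, rest)
    r_yz = partial_corr(r, y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical zero-order correlations (not taken from the study).
r = {
    frozenset(("pay_change", "performance")): 0.60,
    frozenset(("pay_change", "job_level")): 0.30,
    frozenset(("performance", "job_level")): 0.20,
}
value = partial_corr(r, "pay_change", "performance", ["job_level"])
print(round(value, 3))
```

Adding further names to the `controls` list (with the corresponding entries in `r`) extends the same call to second- and third-order partials, as would be needed when three other independent variables are held constant.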


RESULTS 


Results of the analysis are contained in Table 
1. Job level was the major correlate of pay levels, 
but it was not significantly correlated with 
changes in pay. Performance, on the other hand, 
was more predictive of pay changes than pay lev-
els. There was a significant negative correlation
between length of service and pay changes, while 
education was not predictive of either pay levels 
or changes. The multiple correlations indicate 
that the four independent variables accounted 
for approximately 50% of the variance in each 
of the dependent variables. 


DISCUSSION


As was also found by Lawler (1966), perfor- 
mance levels accounted for a negligible propor- 
tion of the variance in pay levels. The low corre-
lations probably resulted because pay levels in
the organization were highly stable (r = .91),
while performance levels were not (r = .53). It
thus appears that the organization was unwilling
to change managers’ pay in the degree necessary 
to maintain a high relationship between perfor- 
mance and pay levels. 

As argued previously, low correlations between
performance and pay levels do not necessarily

pay change in th
managers with the hi
formance tended to be rewarded
percentage increases in pay, even
controlling for the effects of job level, length of service,
and education.


TABLE 1

PARTIAL AND MULTIPLE CORRELATION ANALYSIS OF
DETERMINANTS OF PAY LEVELS AND
PAY CHANGES

                                      Pay
                           Level                Changes
Variable              1970        1971         1970-1971
                    (n = 63)    (n = 68)       (n = 51)
Performance           .19         .25*           .65**
Job level             .73**       .71**
Length of service                                -.30
Education             .07        -.00            -.00
Multiple R                                        .72

Dependent variable is percentage change in pay; inde-
pendent variables are 1971 values.
* p < .05.
** p < .01.


OF


WHITE-COLLAR EMPLOYEES AS A FUNCTION 
SOCIOLOGICAL BACKGROUND 


S. D. SALEH 1 AND T. SINGH 2


Department of Management Scien


More than 3,000 employees indicated their job orientation by ranking six
intrinsic and six extrinsic factors. Their father's occupations and the size of
communities in which they worked were identified. The sample was classified
according to salary level, education, and sex. In the low-salaried group
(<$10,000) employees whose fathers held primarily unskilled jobs were less
intrinsically oriented than employees whose fathers held primarily technical
jobs. The latter group was, in turn, less intrinsically oriented than employees
with professional fathers. Moreover, in the low-salaried group, a positive
relationship was found between intrinsic job orientation and community size.
No differences in job orientation were found in the high-salaried group as
a function of either fathers' occupation or community size.


This paper investigates the relationship between
intrinsic job orientation, father's occupation, and
community size in a sample of white-collar work-
ers. The father's occupation is considered an im-
portant factor in the socialization process and in
transmitting work values to children (Elder,
1968). There should be a positive relationship
between the skill level of the father's occupation
and intrinsic work values.

Turner and Lawrence (1965) found that blue-
collar workers in small-town communities
more intrinsically
communities. They suggested that the results re-
flected the difference in the social values (Prot-
estant ethic) of their groups.

Since the present study included predominantly
white-collar employees who are generally consid-
ered middle class, the nature of jobs in the dif-
ferent communities was expected to be one of
the main differentiating f
job orienta-
tion. In the organization in which the study was
conducted, the more responsible and complex
jobs were more concentrated in larger communi-
ties than in smaller communities. Assuming that
job complexity is positively
orientation, it was expected
larger communities would
oriented than those in smaller communities.


METHOD


The data collected for this study were obtained
as part of a questionnaire distributed to
a large, predominantly white-collar
organization. All employees who were earning
$10,000 or more at the time of the study (n =
2,153) were asked to participate. Ten percent of
the other employees (n = 3,553) were randomly
selected. Respondents (n = 4,143) indicated their
salary: 2,444 of them were earning less than $10,-
000 and 1,699 of them were earning $10,000 or
more. This represents a return of 69% and 79%,
respectively. Information concerning father's oc-
cupation was provided by 1,830 of the under
$10,000 employees and 1,447 of the over $10,000
employees. Information about community size
was given by 2,030 employees earning less than
$10,000 and 1,501 employees earning $10,000 or
more.

The occupations of the employees’ fathers 
were classified into three general categories: pro- 
fessional, technical, and unskilled. Three cate- 
gories were also used for community size: rural 
communities (villages and towns of less than 
10,000), urban communities (towns and cities of 
from 10,000 to 300,000), and metropolitan com- 
munities (cities with over 300,000). In addition, 
sex, education, and salary level were also used 
as independent variables. The subjects were asked 
to rank 12 job-related factors in the order of
their importance to them (Saleh & Pasricha,
1970). Six factors were considered intrinsic:
achievement and accomplishment, chances for
promotion, chances for experience and growth
in skill, nature of work, recognition, and responsibility.
The other six factors were considered extrinsic:
working conditions, security, relationships
among employees, status, salary, and supervision.
To convert this ordinal scale into an
interval scale for each subject, the rankings of 
the 12 factors were changed into their corre- 
puted for the low-salaried group, where sex,
education, father’s


one-way analysis of variance was computed for each of the groups of
university males, university females, and high
school males.


= 23.83, df = 2/1819, p < .001). In


university subjects. The mean scores were 33.8,
34.3, and technical, and
respectively (F = 5.97,


First, intrinsic orientation increases along
the unskilled-technical-professional
the lower class wou
extrinsic aspects,

Although the relationship between the father's
occupation and intrinsic orientation was highly
significant for the low-salaried group, it was


highly paid jobs—which are assumed to reward 
intrinsic values—may have overridden any effects 


The effect of community 
that of father’s occupation, 
low-salaried group, the mean 
creased, going from rural (30.6) to urban (31.2)
to metropolitan (32.2) (P= i
p < .001). With respect to community size, the


among male 
Subjects in 


Communities 
1311, p < .001).
As expected, the results of the community
variable show an opposite trend to those found
by Turner and Lawrence (1965)
Blood (1967). Both of these studies used blue-collar
workers, while the present study investigated
white-collar employees. It


that both previous investigations cited reasons
other than the size of the community for the re- 


cation in the case 
size was also highly significant (p < .001).

In the present study, 59.2% of the respondents
from metropolitan communities had a university
education. The figure was 46.9% among respondents
from urban communities
for respondents from rural communities
higher educational levels in the larger communities
could influence the work values of subjects in
these communities toward being more intrinsic.

Another mediating variable between com- 
munity size and work values, which may explain 
our results, is job level. As in most large organi- 
zations, the organization in which the present 
study was conducted tended to concentrate its 
high-level jobs in the larger communities. Since 
intrinsic orientation increases with job level, and 
larger communities have more responsible and 
complex jobs than the smaller ones, it was ex- 
pected that larger communities would contain 
more intrinsically oriented people. The absence 
of differences among the high-salaried group may 
be explained in the same way as was done for 
the variable of father’s occupation. 

In conclusion, it appears that variables such as 
education, social class, and early socialization 
affect work values only in the case of individuals 
in lower level jobs. Individuals holding high-level 
jobs, regardless of their past experiences, appear 
to be concerned with the intrinsic aspects of their 


jobs. 
RELATIONSHIP BETWEEN VARIETY OF WORK EXPERIENCE
AND PERSONALITY 


CHARLES F. ELTON 1 AND HARRIETT A. ROSE


University of Kentucky


Differences in personality test scores between groups of college freshmen
reporting different varieties of precollege work experience were investigated. The
subjects (separated by sex) were randomly selected in groups of 100 each
from populations divided into below average, average, and above average work
experience. The independent variables were sex (male and female) and three
levels of work experience; the dependent variables were Omnibus Personality
Inventory (OPI) scale scores. Several OPI scores differed significantly (.01)
by level of previous work experience. These findings were interpreted as
operationally defining the construct of work personality.


A direct investigation of the personality char- 
acteristics of persons with differing varieties of 
work experience has not appeared in the literature. 
Instead, investigators have attempted to infer this 
relationship indirectly through studies relating
personality variables to work adjustment (Lofquist
& Dawis, 1969; Neff, 1968). Undoubtedly
the difficulties in analyzing the variety of adult
work experience have contributed to the absence
of research in this area.


This study investigates the difference in personality
test scores between groups of college freshmen
who differ in the variety of their precollege
work experience. Specifically, the following two
hypotheses were tested: First, there will be no
Sex × Variety of Work Experience interaction,
and second, there will be no difference on personality
measures between subjects with above average,
average, and below average work experiences.


METHOD 
Instruments 


Scores on the following 14 scales of the Omnibus
Personality Inventory (OPI), Form F (Heist
& Yonge, 1968), were obtained for each subject:
Thinking Introversion (TI), Theoretical Orientation (TO),
Estheticism (Es), Complexity (Co),
Autonomy (Au), Religious Orientation (RO), Social
Extroversion (SE), Impulse Expression (IE),
Personal Integration (PI), Anxiety Level (AL),
Altruism (Am), Practical Outlook (PO),
Masculinity-Femininity (MF), and Response Bias
(RB).

1 Requests for reprints should be sent to Charles F.
Elton, College of Education, University of Kentucky,
Lexington, Kentucky 40506.

A variety-of-work-experience score for each
subject was taken from the Student Profile Report
(SPR) of the American College Test (American
College Testing Program, ACT, 1970). The
Work Experience scale consists of the following 
seven items (number in parentheses indicates the 
frequency of combined male and female responses 
to that item by the population from which the 
sample was chosen): worked regularly for pay 
(58%); obtained one or more jobs without the 
help of my parents or friends (54%); had a job 
when my friends couldn't find one (54%); earned 
one or more raises or promotions because of good 
work (29%); changed from one job to another 
because of better opportunities (21%); did a 
job most people my age couldn’t do (20%); and 
supervised the work of others (19%). The student
answered “yes” (I did this activity) or “no”
to each of these seven items. Subjects who failed
to answer any work experience item were not
included in the population from which random
samples were chosen.


Procedure 


Mean work experience scores and their respec- 
tive standard deviations were computed for 1,330 
male and 1,046 female freshmen (male, M = 
2.96, SD = 1.03; female, M = 1.82, SD = 1.76).
These subjects comprised the entire 1
freshman class for whom all test data were available.
On the basis of these measures, male and
female subjects were divided into below average,
average, and above average groups. An average
score was one which fell within the range of plus
and minus one standard deviation of the mean
score; an above average score was more than one
standard deviation above the mean; a below
average score was more than one standard deviation
below the mean.
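
The grouping rule just described can be sketched in a few lines of Python. The mean and standard deviation in the example are hypothetical round numbers chosen only for illustration (they are not the values reported above), and the function name `classify` is likewise an assumption of this sketch.

```python
def classify(score: float, mean: float, sd: float) -> str:
    """Label a work experience score relative to the mean plus or minus one SD."""
    if score > mean + sd:
        return "above average"
    if score < mean - sd:
        return "below average"
    return "average"


# Hypothetical mean and SD, for illustration only.
EXAMPLE_MEAN, EXAMPLE_SD = 3.0, 1.5

# The Work Experience scale has seven yes/no items, so scores run from 0 to 7.
labels = {score: classify(score, EXAMPLE_MEAN, EXAMPLE_SD) for score in range(8)}
for score, label in labels.items():
    print(score, label)
```

With these illustrative values, scores of 2 through 4 fall in the average band, while lower and higher scores are labeled below and above average.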


Average work experience
average scores w
average s
average
scores were 1, 2, 3, and 4; and above average
scores were 5, 6, and 7.

Fig. 1. Standardized mean differences between subjects with below average,
average, and above average varieties of work experience on the Omnibus
Personality Inventory (OPI) scales. (Legend: .. Above Average,
Average, -- Below Average.)

The mean scores on the OPI scales for subjects
separated by sex (male and female) and
three levels of variety of work experience were
subjected to a multivariate analysis of covariance
with the ACT Composite Score as the covariate.


Subjects

Random samples of 100 each were chosen from
the following male populations: 333 subjects with
below average work experience scores, 824 subjects
with average work experience scores, and
173 subjects with above average work experience
scores. Likewise, random samples of 100 each
were chosen from the following female populations:
384 subjects with below average work experience
scores and 582 subjects with average
work experience scores. Because only 80 females
reported above average work experience, these
subjects constituted the above average group.


RESULTS 


The results of the multivariate analysis of covariance
for the interaction effect of Sex × Variety
of Work Experience were insignificant (F = [...],
df = 28/1120, p < .72). Thus, it may be assumed
that the effect of work experience on the
variables is the same for males and
females.

The results of the multivariate analysis of covariance
for variety of work experience were significant
(F = 2.86, df = 28/1120, p < .01). The
univariate differences between groups were expressed
in standard deviation units and these are
shown in Figure 1.
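
Expressing a univariate difference in standard deviation units amounts to dividing a difference between two means by a standard deviation. The Python sketch below shows only that arithmetic, with hypothetical numbers. Which standard deviation was used for Figure 1 (for example, a pooled within-group value) is not stated, so the `sd` argument and the function name are assumptions of this sketch.

```python
def standardized_difference(group_mean: float, reference_mean: float, sd: float) -> float:
    """Express the difference between two means in standard deviation units."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (group_mean - reference_mean) / sd


# Hypothetical scale means and SD, for illustration only.
example = standardized_difference(group_mean=52.0, reference_mean=50.0, sd=10.0)
print(f"standardized difference: {example:.2f} SD units")
```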

Subjects with above average mean work experi- 
ence scores earned significantly higher mean OPI 
scores on Thinking Introversion and Social Ex- 
troversion compared to subjects with either aver- 
age or below average scores. Subjects with many 
work experiences earned a significantly higher 
mean score on Complexity than did those with 
few work experiences. Also, subjects with few
work experiences earned a significantly lower
mean score on Impulse Expression than did subjects
with either average or many work experiences.
Finally, subjects with many work experiences
earned a significantly lower mean score on
Masculinity-Femininity than did those with few
work experiences.


DISCUSSION


Freshmen with many work experiences, in con- 
trast to those with few, may be described as 
sociable people who are emotionally sensitive and 
who are tolerant of ambiguities and uncertainties. 
In addition, they are ambitious, aggressive, and 
possess wide-ranging academic interests.

Neff (1968) and Lofquist and Dawis (1969) 
have proposed the term "work personality" to ac- 
count for differences found in adults who vary in 
their adjustment to work. As used by these authors,
work personality is a hypothetical construct
which attempts to link work adjustment
to such variables as motivation, ability, personality,
and previous experience. The finding in
this study that variety of precollege work experience
is related to personality measures obtained
at the beginning of college suggests the usefulness
of the construct of work personality.
The findings in this study, like most of those
which are exploratory in nature, indicate the need
for additional research. For example, most work
available to adolescents does not require specialized
ability or previous experience. Presumably,
the jobs held by subjects who answered “yes”
to the work experience items are representative
of unskilled, part-time entry level jobs in which
personality attributes may have played an important
part in employment. Whether or not this
assumption is valid can be determined only by
additional evidence. Do subjects with above
average work experience also possess more specialized
skills and abilities than subjects with
below average work experience? If all precollege
jobs were classified by skill level, would subjects
differ in personality traits between levels of skill?
Other related research questions are: Do subjects
differ in their personality characteristics according
to age of obtaining first job? Are personality
attributes related to length of time subject worked
for same employer?

VISUAL CUES AND VERBAL CONTENT AS INFLUENCES ON
IMPRESSIONS FORMED AFTER SIMULATED
EMPLOYMENT INTERVIEWS 1


PAUL V. WASHBURN AND MILTON D. HAKEL 2


Ohio State University


Interviewer behavior and mode of information presentation were manipulated
in a complex factorial design to examine the nature and extent of their
influence on observers' perceptions of interviewers and job applicants. Observations
and ratings made by 122 psychology students were more favorable under
conditions of increased use of gestures, eye contact, and smiling. Audiovisual,
visual-only, and transcript modes of presentation were found to be differentially
effective in influencing perceptions.


While it is known that the characteristics and 
behaviors of an applicant during the interview 
can influence the interviewer’s perceptions of 
that applicant, little is known about the extent to 
which the interviewer's perceptions can be inad- 
vertently contaminated by the interviewer's own 
shaping of the applicant's behavior. Eye contact 
(Kleck & Nuessle, 1968), gaze direction (Exline, 
Gottheil, Paredes, & Winklemeier, 1968), posture 
and distance (Dittman, 1962; Ekman, 1964; 
Mehrabian, 1968), duration of speech and silence 
(Matarazzo, Weins, & Saslow, 1965), and verbal 
reinforcement (Greenspoon, 1965) are but a few 
of the many kinds of interviewer behavior which 
have been studied for their impact in interviews. 

This study examines observers’ perceptions of 
job applicants and of interviewers as a function
of interviewer “enthusiasm,” operationalized here
as the interviewer's use of eye contact, gestures,
and smiling. Since this study relies on observers'
perceptions, it represents only an approximation
to a direct test of the effect of interviewers'
behavior on interviewers' perceptions of job applicants.
Consequently, to check on the adequacy
of the experimental manipulations, this study also 
examines the effects on these perceptions of how 
the information is obtained by the observer. Spe- 
cifically, such differences as a function of mode 
of information presentation—verbal content or 
visual cues—were investigated. 


1 This research was supported by National Science
Foundation Grant GS-2355 to Milton D. Hakel.

2 Requests for reprints should be sent to Milton
D. Hakel, Department of Psychology, Ohio State
University, 404-C West 17th Avenue, Columbus, Ohio
43210.


METHOD


Preparation of Stimulus Materials

Four simulated campus interviews lasting about
15 minutes each were recorded with a videotape
recorder by two teams of graduate students from 
the industrial-organizational psychology pro- 
gram. One member of each team played the role 
of a campus recruiter and the other played the 
role of a graduating senior seeking a sales posi- 
tion in a local firm. 

Each team of actors played their roles under 
two conditions, with instructions to repeat as 
nearly as possible the same verbal content. In 
the first interviews, the enthusiastic condition,
the interviewers were instructed to use gestures
and smiling frequently and to maintain eye contact
as much as possible. In the second interviews,
the unenthusiastic condition, the interviewers
were to minimize the use of gestures,
smiling, and eye contact. No differential instructions
were given to the students playing the roles
of applicants and, in particular, they were not 
told that the interviewers would be varying the 
frequency of gestures, eye contact, and smiling. 

To check the adequacy of the videotapes and
to develop suitable instrumentation for measuring
observer perceptions, a pilot study was conducted.
Conditions were alternated to check for
any effects of order of presentation and also for
any differences between the role-playing teams.
Discussions afterward revealed that participants
thought they were seeing videotapes of
campus interviews, not simulations.


Subjects 


Male and female students from the introductory
psychology course at the Ohio State University
served as observers (total N = 122).


Procedure 


Groups of participants were given a questionnaire
and instructed that the purpose of the experiment
was to learn about the process of impression
formation in employment interviews.
After familiarizing themselves with the type of
TABLE 1


ITEM MEANS ARRAYED FOR ENTHUSIASTIC VERSUS UNENTHUSIASTIC INTERVIEWERS UNDER THREE
MODES OF INFORMATION PRESENTATION TOGETHER WITH OMEGA-SQUARE (ω²)
VALUES WITHIN EACH MODE

                                          Mode of presentation

                                Audiovisual            Video only            Transcript
Questionnaire item          Enthus. Unenth.  ω²    Enthus. Unenth.  ω²    Enthus. Unenth.  ω²

Applicant
  Overall performance         3.67    3.05  .073     2.84    3.30  .038     3.85   [...]  .103
  Estimated evaluation time   3.15    2.56  .066     2.39    1.98  .029     2.80    2.56  .004

Interviewer
  Interviewer's age          27.51   27.56  .000    27.58   28.15  .000    28.29   31.17  .060
  Liking of his job           3.89    2.80  .175     3.93    2.21  .15      3.72    3.02  .076
  Youthfulness                3.66    2.70  .153     3.26    2.90  .018     3.61    2.97  .070
  Enthusiasm                  4.08    2.77  .255     3.51    2.11  .282     3.89   [...]  .053
  Approachability             3.74    2.56  .216     3.51    2.34  .214     3.79    3.34  .032
  Desirable boss              3.46    2.62  .120     3.11    2.31  .109     3.48    3.25  .002
  Personal interest           3.61    2.28  .261     3.26    1.89  .273     3.75    3.16  .059
  Consideration               9.25    2.72  .226     3.20    2.51  .112     3.64    3.26  .031
  Intelligence                3.51    3.11  .041     3.23    3.02  .005     3.28    2.97


[...] condition, [...]. Half saw Team [...]. They were then given a
new questionnaire and viewed a videotape from
the unenthusiastic condition. Team 1 and Team 2
were alternated appropriately so that no subject
saw any team under both conditions. A fixed
order of presentation for enthusiastic conditions


TABLE 2
SUMMARY OF 11 2 × 2 ANOVAs WITH REPEATED MEASURES ON FACTOR A SHOWING EFFECTS OF INTERVIEWER
ENTHUSIASM AND AUDIOVISUAL VERSUS VIDEO MODES OF PRESENTATION


        Enthusiasm (A)a      Mode (B)a        A × B a       Error
Item      MS      F          MS      F        MS      F      MS b
ese: 

2 15.25 5.31 5.81* 17.85 12.93** 

3 3.32 -n 26.88 22.96** 50 47 1 [^ 

4 25.58 : 8.20 59 2.44 

5 26.89 Lis 62.00 10-3505 E: ti 13.99 (df = 116)« 

6 107.55 ean 37 03 561 m 1.32 

7 70.33 24.93 19,52** : 1.43 

8 36.99 32.86** 2.56 195 1S | AS 1.28 

TE fone 0.84 s 50 37 1.31 

* 3.25** a 1.21 37 
10 46.92 n 9.05 n did .04 1.36 

E ` 3.74* 1.98 574» | p 1.48 1.03 

Kad. — $5 te =< Vibe .23 67 

b Sue in A, df = 1: 0. feti d e Oi ee a 

en = or people Tespondin - T 

A E ar ' "M — to the "age of Ntery le 




TABLE 3


SUMMARY OF 11 2 × 2 ANOVAs WITH REPEATED MEASURES ON FACTOR A SHOWING EFFECTS OF INTERVIEWER
ENTHUSIASM AND AUDIOVISUAL VERSUS SCRIPT MODES OF PRESENTATION

        Enthusiasm (A)a      Mode (B)a        A × B a       Error
Item      MS      F          MS      F        MS      F      MS b
1 26.23 je 1.33 1.5: brs 
2 10.66 9.47** 1.81 135 (ar oe ks 
3 401.97 14.94** 240.02 10.23** 122.46 4ss* 
4 41.80 25.19** 10 08 148 [5 23.46 (df = 110) 
5 4098 | 2916" 80 77 133 P [n 
6 54.20 41.06** 1.80 1.73 6.89 s 22* bos 
7 39.36 3135" 10.25 741**.— | 7.93 p ae 
8 18.40 15.42** 5.61 3.63 3.94 3.30 Es 
9 49.59 33.22** 13.79 10.57** 10.25 6.86** i^ 
10 28.92 31.59** 1.98 2.07 7.23 7.90** S 
11 4.74 5.62* 1.33 1.47 16 02 ps 


a df = 1.
b Subjects in A, df = 120.
* p < .05.
** p < .01.
was adopted because the pilot study showed that
the enthusiastic-unenthusiastic order effects were
not significant.

The remaining 61 observers viewed the visual
portions of two of the interviews and also read
two transcripts. The same sequence of enthusiastic
followed by unenthusiastic conditions was
used, but the order of treatments (viewing the
visual portion and reading the transcripts) was
randomized to counterbalance possible order effects.



Questionnaire responses were analyzed in the
framework of three 2 × 2 factorial analyses of
variance (ANOVAs) with repeated measures.
Analyses of variance were computed separately
for each of the 11 items shown in Table 1, and
F tests were computed for enthusiasm and mode
of presentation main effects and the interaction.
Omega-squares were also computed.


RESULTS 


Interviewer enthusiasm resulted in consistently 
more favorable mean ratings across all items 


TABLE 4


SUMMARY OF 11 2 × 2 ANOVAs WITH REPEATED MEASURES ON FACTORS A AND B SHOWING EFFECTS OF
INTERVIEWER ENTHUSIASM AND VIDEO VERSUS SCRIPT MODES OF PRESENTATION

        Enthusiasm (A)a      Mode (B)a        A × B a       Error
Item      MS      F          MS      F        MS      F      MS b
1 80 67 11.95 14.77** 20.08 ARR 
2 6.56 6.97" 47s | 14.45% Al moe | n 
3 332.20 14.02** 194.05 6.05* 159.47 Mos 95 
4 44.33 29.01** 28.92 25.87** 1.98 143 | 28.72 (df = 58) 
5 16.27 14.57** 1.18 1.06 148 Ej 1.10 
6 60.00 61.29** 40.17 41.85** 9.05 d eoa 1.04 
- 31.02 22.23** 23.05 1801** | 4.46 ae 1.23 
8 an] 19677 3031 | 2282" | 3.21 | 92 1.19 
9 52.33 47.28** 45.18 A519 | 1151 | 2.65 — | 100 
10 - 17.32 26.68** 21.84 2243** | 217 | ee 1.23 
17* 66 08 — | i A 1.00 
=. | | S. 7 Ss 
a df = 1.
b Subjects in A, df = 60.
* p < .05.
** p < .01.
and unenthusi-
astic interviewers on all Table 1 items pertaining 
to the interviewer in the audiovisual mode are 
enthusiastic in- 
being younger, liking 
youthful, being more 
of person who could 
more liked in a role 
as a boss, showing more personal interest in the
applicant, showing more consideration, and being


Interviewer enthusiasm is also related to the
evaluations of the applicant, so that the overall
performance of the applicant was rated higher
in the enthusiastic condition than
in the unenthusiastic condition. Of
are a function


applicants did not respond to the interviewer's
enthusiasm. The applicant’s ratings in the visual-only
and transcript conditions show that elimi-


visual cue does 


keep the verbal 
as similar as pos- 


» 3, and 4, reveals that 


S Seventeen 


20 of sls were significant, 


s in the two ANOVAs whi 
l D f ch 
contrasted the Visual-only Condition with the 
er modes, With few item exceptions, th 
results show me 


al Channel is com- 
ript modes, it is 
uencing observers’ 
gud interaction of enthusiasm and mode 
tests, Ixin nus Significant for 11 of 33 F 
modes involving yj © 11 tests the information 
larger discriminati Presentation resulted in 


+6 ONS betwe "n 
and Unenthusiastic conditions enthusiastic 
ese instances 


visual cues Were more į i 
tent in determining tho edat th 
Returning to Table 1 a. 
are directly interpretable 
ance accounted for in qi n 
from unenthusiastic Conditions 
of Presentation, Averaged 
the variance Was accounted for b; eg. of 
behavior manipulation in t 
tion, 11% in the video- n 


as 


in the audiovisual condition. These proportions
show that the combination of visual cues and
verbal content was maximally responsible for the
obtained differences in ratings. Also, they indicate
that for the type of judgments made, visual
cues tended to be more important than the verbal
content.
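
The averaged proportions of variance referred to above can be recomputed from the omega-square columns of Table 1. The Python sketch below does so for the audiovisual and video-only modes, under two loudly stated assumptions: the audiovisual value printed as "1.226" is read as .226, and the partly illegible video-only value printed as "15" is taken as .15. The transcript column is left out because one of its values did not survive in the source.

```python
from statistics import mean

# Omega-square values per item, read from Table 1 in row order.
# ASSUMPTION: "1.226" (audiovisual, Consideration) is read as .226.
audiovisual = [0.073, 0.066, 0.000, 0.175, 0.153, 0.255,
               0.216, 0.120, 0.261, 0.226, 0.041]
# ASSUMPTION: the garbled "15" (video only, Liking of his job) is read as .15.
video_only = [0.038, 0.029, 0.000, 0.15, 0.018, 0.282,
              0.214, 0.109, 0.273, 0.112, 0.005]

columns = {"audiovisual": audiovisual, "video_only": video_only}
avg_percent = {name: round(100 * mean(values)) for name, values in columns.items()}

for name, percent in avg_percent.items():
    print(f"{name}: about {percent}% of variance on average")
```

Under these readings the video-only average comes out at about 11%, which matches the figure quoted in the text.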


DISCUSSION


Although most items were conceptually unrelated
(e.g., Youthfulness and IQ), ratings were
higher on al[...]
regardless of [...]. This consistency
is probably a consequence
of the time-honored halo error. It is noteworthy
that simple changes in visual cues and verbal
content created so pervasive a difference and an
inflation in the ratings. The implication of this
finding for applicants, salesmen, politicians, and
others wishing to create good impressions is quite
straightforward: It's not just what you say, it’s
how you say it.

The results run parallel to studies in other
contexts where mode of presentation was varied.
For example, Shapiro (1966) used audiovisual,
video-only, audio-only, and transcript conditions
to study the degree of agreement among observers
in their ratings of the pleasantness or unpleasantness
of an interviewee's feelings. He concluded
that judgments of affect were influenced
principally by information presented in a visual
mode.

The results of this study show that both mode
of presentation and the nature of the information
presented have an impact on observers' judgments
of interviewers and applicants. In particular, they
illustrate the need for standardizing interviewer
behavior, especially to guard against inadvertent
contamination of perceptions. Further, the results
underscore the importance of visual presentation
of information, an information channel which has
been somewhat neglected in most studies of decision-making
processes in interviewing.

ANOTHER LOOK AT CONTRAST EFFECTS IN THE
EMPLOYMENT INTERVIEW 1


FRANK J. LANDY 2 AND FREDERICK BATES 3


Pennsylvania State University


Several studies have appeared describing possible contrast effects in the
employment interview. Many of these studies have concentrated on preinterview
resumé perusal as an indicant of these effects. The present study
and follow-up question the adequacy of the dependent variable in these
investigations as well as the nature of the subject population.


While the validity of the employment interview
has been seriously questioned over the past decade,
it continues to be used as one component,
and often the primary component, of the selection
process. There is a general feeling among
researchers in this area that something is going on
in the interview which is systematic. For both
practical and theoretical reasons, researchers have
begun to look at the dynamics of the employment
interview.

One approach has been to look at the possibility
of contrast and/or assimilation effects in the
interview. It is not hard to get anecdotal support
for this possibility from professional interviewers
themselves. They will invariably state that a
mediocre applicant preceded by a number of
"lemons" looks pretty good, particularly toward
the end of a recruiting trip! One way of demonstrating
the existence of this effect, while still
maintaining some semblance of experimental control,
is to have interviewers rate resumés constructed
to typify various levels of applicants
(i.e., resumés describing the typical good, average,
or poor applicant). On the basis of inspection
of a particular resumé, a rating is made describing
the general "suitability" of the applicant
or the "willingness to hire" on the part of
the "interviewer."

1 The research

A study by Hakel, Ohnesorge, and Dunnette
(1970) is illustrative of this. The primary independent
variable is "induced expectation," or the
nature of the information preceding a "target"
stimulus. The dependent variable is a
rating made on the target stimulus, in this case
a resumé. Any deviation of the rating of the
target resumé from the rating that would have
been obtained had no previous information been
presented is considered a function of assimilation
or contrast effects. The judgmental situation
involves three resumés: the first two determine
the expectation level and the third is the target
resumé. The resumés are carefully constructed
[...] information likely to
[...] "good," "average," or "poor" applicants.

Hakel et al. (1970) used two groups of interviewers:
students and professionals. The design
was a 3 × 2 factorial, with three levels of preceding
information (good, average, or poor) and
two levels of target information (good or poor).
It was demonstrated that while there were reliable
differences in the evaluation of target resumés
as a function of preceding information, the
differences were trivial. Preceding information
accounted for less than 2% of the variance in
ratings of target resumés.

Leonard and Hakel (1971) extended the research
design to examine the stability of the
stimulus value on criterion judgments by increasing
the number of resumés preceding the
target resumé. These preceding resumés were
homogeneous with respect to applicant level (e.g.,
HHHHL as opposed to their original 3 stimulus
set HHL, where H = high and L = low). They
failed to find a significant relationship between
length of preceding stimulus set and criterion
ratings. The subjects were student “interviewers.”

Wexley, Yukl, Kovacs, and Sanders (1972) had 
student interviewers view videotapes of inter- 
views and assign criterion ratings to the applicants.
Their criterion rating scale was similar to
the one used in the Hakel et al. (1970) study. 
Their findings were similar to those of Hakel. 
Statistically significant, but trivial, contrast ef- 
fects were found when good and poor applicants 
occupied the third position. When an average 
applicant occupied the third position, the con- 
trast effects were overwhelming, accounting for 
80% of the variation in criterion ratings. 

The present study was intended to be an extension
of the Hakel et al. (1970) and the Leonard
and Hakel (1971) studies. It was felt that
the number of resumés used in the Leonard and
Hakel study was still too small, and in addition,
the situation would be more realistic if preceding
TABLE 1
MEAN EVALUATIONS OF TARGET RESUMÉS

Expectations induced by preceding two resumés | Target resumé (Study I) | Target resumé (Study II)
Professionals a | [...] | Professionals b
Good | Poor | Good | Poor | Good | Poor
Good | 7.73 | 2.33 | 7.33 | 3.43 | ex
Average | 7.86 | x1 | 8.07 | 3.29 | 515
Poor | bin | | 8.27 | 3.06 | 78 | 36
a n = 90.
b n = 60.
stimulus information were heterogeneous rather
than homogeneous. 


METHOD AND RESULTS 
Study I 


The design was a 3 × 2 factorial, with three
levels of preceding information in the form of a
resumé (good, average, or poor) and two levels
of target resumé (good or poor). Two samples
of subjects were used: students and professionals.
The students rated the resumés in groups,
while resumés were mailed out to professionals
who did recruiting on the campus at the Pennsylvania
State University.* In each sample 15 subjects
were randomly assigned to each cell, yielding
a total N of 90 for each sample. Students
were given course credit for participation. The
response rate for the professional sample was
78%.

The resumés and criterion rating scale were 
taken directly from the Hakel study. The task 
consisted of rating 12 resumés. The last of the 
twelve was designated as the target resumé. The 
two preceding resumés (#10 and #11) were used 
to induce the expectation; both were either good, 
average, or poor. The first nine resumés consisted 
of random sequences of good, average, and poor 
resumés—three good, three average, and three 
poor. Each subject received a different random 
sequence of the first nine resumés. The subjects 
were told that the applicants were senior account- 
ing students seeking employment as accountants. 
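
The construction of each subject's set of 12 resumés can be sketched as follows. This is a minimal Python illustration of the ordering described above, not the authors' procedure for preparing materials; the function name `build_sequence`, the level labels, and the fixed random seed are assumptions made for the example.

```python
import random

LEVELS = ("good", "average", "poor")
TARGET_LEVELS = ("good", "poor")


def build_sequence(expectation: str, target: str, rng: random.Random) -> list:
    """Return the levels of the 12 resumés shown to one subject.

    Positions 1-9: three good, three average, and three poor in random order.
    Positions 10-11: both at the induced expectation level.
    Position 12: the target resumé.
    """
    if expectation not in LEVELS:
        raise ValueError(f"unknown expectation level: {expectation}")
    if target not in TARGET_LEVELS:
        raise ValueError(f"unknown target level: {target}")
    first_nine = [level for level in LEVELS for _ in range(3)]
    rng.shuffle(first_nine)  # a different random order for each subject
    return first_nine + [expectation, expectation, target]


rng = random.Random(0)  # seeded only so the example is repeatable
sequence = build_sequence("poor", "good", rng)
print(sequence)
```

Crossing the three expectation levels with the two target levels gives the six cells of the 3 × 2 design.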

The results for Study I indicated that neither 
assimilation nor contrast effects were present in 
judgments. An analysis of variance indicated no 
effect for the induced expectation factor in either 
the professional sample or the student sam- 
ple. There was an overwhelming effect on ratings 
as a function of the level of the target resumé for 
both the professional sample (F = 455.64, df =
1/84, p < .001, estimated ω² = .83) and the student
sample (F = 263.71, df = 1/84, p < .001,
estimated ω² = .75). The mean ratings for the
various groups can be found in Table 1.
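
The article does not say which estimator of ω² was used. As a rough check, the sketch below assumes the common approximation ω² ≈ df_effect (F − 1) / [df_effect (F − 1) + N], which is strictly a one-way formula and only approximates the factorial case when the other effects are negligible. The function name `omega_squared` is an assumption introduced here.

```python
def omega_squared(f_value, df_effect, n_total):
    """Approximate omega-squared from an F ratio.

    Assumes the one-way between-subjects approximation
    df_effect * (F - 1) / (df_effect * (F - 1) + N).
    """
    numerator = df_effect * (f_value - 1.0)
    return numerator / (numerator + n_total)


# Study I main effects of target resumé level (N = 90 per sample).
professionals = omega_squared(455.64, 1, 90)
students = omega_squared(263.71, 1, 90)
```

Under this assumption both values fall within about .01 of the estimates reported in the text, which suggests the reported figures come from this estimator or a close relative of it.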

The fact that Hakel found statistical evidence 
for contrast effects, while we found none, is of 
little importance, since the amount of variance 
accounted for by Hakel was trivial and the per- 
centage attributable to the main effect of target 
resumé level was identical. 


*In spite of the fact that the professional sample
was specifically instructed to complete the ratings
in the specified order without looking through the
entire set of resumés, there is no way of insuring
that this procedure was followed. Four forms were
returned which had been unstapled by the respondent;
these forms were discarded prior to analysis.
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TABLE 2 
CRITERION RATING SCALE FOR STUDY II
Scale
value Anchor

9 Would make every possible effort to contact and talk with this outstanding candidate. Would make a
special trip to talk to this candidate.

8 Would term this individual an excellent candidate and ask him to visit while I was on campus. Would
remain on campus an extra day to talk to him.

7 Would write a personal letter to this above-average candidate encouraging him to sign up for an
interview or arrange for a special visit.

6 Would make an extra effort to have the candidate sign up in a vacant spot on the schedule. Write
a form-type letter encouraging him to sign up.

5 Would interview the candidate if he signs up and would interview him on a campus visit.

4 Would interview the candidate if he signs up on his own initiative and if the schedule were not too
full.

3 Would not discourage the candidate from signing up, but would not be disappointed if he did not.
Would take no action to arrange an interview.

2 Would take mild steps to avoid an interview or a sign-up. Would talk to candidate only if recommended
by placement official.

1 Would show no interest in candidate under any circumstances.


As a test of the Wexley et al. (1972) findings
regarding the nature of responses to average
applicants as test stimuli, two more conditions were
added for each of the samples: good, good, average
and poor, poor, average in Positions 10,
11, and 12. Since these cells were added after
the initial data were gathered, the results were
analyzed separately with t tests. The means for
the student interviewers in conditions good, good,
average and poor, poor, average differed in
the direction suggested by Wexley et al., though
not significantly. The means for the professionals
in these two conditions were identical.

An interesting by-product of Study I was the
fact that over 60% of the professional sample
spontaneously offered the observation that a
decision to hire was never made on the basis of
resumé inspection. Nevertheless, they agreed to
complete the rating task for “the sake of the
experiment.” Not a single student “interviewer”
voiced any concern for this issue!

It then became necessary to develop a more
appropriate criterion to examine the initial
question. An additional 25 professionals were asked
to supply behavioral anchors for a scale reflecting
willingness of an interviewer to invite an
applicant for a plant visit on the basis of resumé
information. Over 50% of the professionals
contacted indicated that such a criterion scale, again,
was inappropriate. Another 25 professionals were
asked to supply behavioral anchors for a scale
reflecting willingness to schedule an interview
with the applicant on the basis of resumé
information. There were very few reservations about
such a criterion, and the resulting scale appears
in Table 2.


Study II 


A second study was run with the new criterion
rating scale. Since neither Study I nor the Leonard
and Hakel (1971) study revealed any systematic
effect of length of preceding stimulus set
on criterion judgments, only three resumés were
used. Data gathering procedures and the design
were identical to Study I, with the exception that
only professionals were used. Ten subjects per cell
yielded a total N of 60. Once again, the analysis
of variance indicated no assimilation or contrast
effects in judgments. A great share of the variance
in criterion ratings was attributable to the
level of the target resumé (F = 127.39, df = 1/54,
p < .001, estimated ω² = .67). The mean ratings
for the various groups can be found in
Table 1.


Discussion 


While the present research results are
similar to those of Hakel et al., Leonard
and Hakel (1971), and Wexley et al. (1972),
Study II is probably a more reasonable test of
the major questions involved. The previous studies
used an inappropriate subject population
and/or an inappropriate dependent variable. The
fact that student interviewers had no reservations
about using an inappropriate criterion scale
makes their suitability as a subject population
doubtful. The fact that professional interviewers
saw hiring decisions as inappropriate on the
basis of resumé information makes judgments
based on a hiring decision doubtful. When the
analyses were based on an appropriate subject
population and an appropriate dependent variable,
neither contrast nor assimilation effects were
evident.

The general lack of support for assimilation and
contrast effects in resumé scanning or vicarious
interviewing suggests that any additional research
in this area, if in fact research should continue
at all, must take a rather different approach. A
more appropriate avenue might be gathering data
from real life, rather than from artificial settings.
For example, one might look for contrast or
assimilation effects in a personnel office that has the
responsibility for hiring hourly paid workers.
While such an approach is obviously more cumbersome
and less efficient in terms of experimental
control, it might also prove more fruitful.
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BEHAVIOR OF TEMPORARY MEMBERS IN SMALL GROUPS 


THOMAS G.

[...] on the behavior of temporary group members and the
[...] members of a previously established group.
The subjects are judges of the Washington
State Supreme Court. [...] indicate that temporary members behaved
[...] toward the permanent members of the group and
the permanent members viewed the temporary newcomers primarily as a
vehicle for workload reduction.


The addition of a newcomer to an established
group sets in motion processes of mutual
adjustment [...] group structure (Mills, 1957).
[...] has argued that the addition of a new member
may be viewed with dislike by regular
members [...] Behringer (1960) [...]. Research by
Ulmer (1964) and Snyder (1958) on the impact
of newly appointed justices to the U. S. Supreme
Court demonstrated that the newcomer substantially
alters the power relations structure within
the judicial group.

Previous studies on the role of the newcomer
have one important variable in common. Each has
portrayed the new member as a permanent addition
to the group. The present study examines
the role of the temporary newcomer. Whether
the new member has regular or short-lived status
may be a significant factor. The temporary member
cannot be viewed as a source of permanent
disruption or as a catalyst for significant change.
Therefore, when the newcomer’s status is temporary,
we would anticipate entirely different behavioral
tendencies within the group. Specifically,
two relationships were hypothesized in the present
investigation. First, the temporary member will
act in a compliant manner toward the permanent
members. Second, the permanent members
will use the newcomer as a vehicle for reducing
the routine workload, but will not use him for
important tasks.


METHOD 


The subjects of this study were judges of the
Washington State Supreme Court, a permanent
collegial tribunal of nine regular members. Its
chief duty is to adjudicate appeals from the lower
state courts and to be the final arbiter of state
law. This particular court was selected for study
because it often uses temporary members (in the
form of lower court judges) to substitute for
permanent members when illness or conflict of
interest causes them to refrain from participation
in a particular case (Washington State Constitution,
Amendment 38).

The Washington Supreme Court may be
conceptualized as a small work and decision-making
group. It has a relatively heavy caseload. Its
decision-making procedures are similar to other
collegial courts; that is, attorneys in behalf of
litigants present written and oral arguments and
the judges study these arguments individually
and then meet in conference to arrive at a
collegial response to the questions presented. One
justice is assigned the task of writing an opinion
explaining the court’s rationale for its decision.
Judges who are at variance with the majority's
conclusions may file formal dissents. The written
opinions and votes of the judges are published for
public and scholarly scrutiny.

Data were collected on all decisions in which a
substitute judge participated during the years
1965-1966, a period during which the permanent
membership remained stable. Nonparticipations
for which the temporary judges substituted were
evenly distributed throughout the permanent
membership. The resulting sample included 370
decisions. Information was gathered on opinion
writing and voting behavior. From these data we
were able to discern work distribution and
consensus patterns within the group. In order to test
for statistically significant differences between
the behaviors of permanent and temporary group
members, z-scores based upon the standard normal
distribution were used (Freund, 1967, pp.
285-287). All scores reported are based upon
one-tailed tests of significance.


RESULTS 

The proposition that temporary group members
act in a compliant manner toward permanent
members was first examined. In order to test for
this possibility, dissent behavior was analyzed.
It was anticipated that temporary judges would
express views at variance with the court majority
a disproportionately low percentage of the time.
In only 19 decisions in our sample did one or
more justices register a dissenting vote, reflecting
the high degree of cohesion exhibited by the
Washington court. Of 123 votes cast by the
permanent judges in these cases, 31 were dissents
(25.2%), whereas only one dissenting vote was
registered by temporary judges out of 19
participations (5.3%). This disproportionately greater
tendency to agree with the views of the majority
and eschew the opportunity to voice independent
opinions exceeds chance probability (z = 1.93,
p < .05).
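
The article cites Freund (1967) for its z-scores but does not print the formula. The sketch below assumes the familiar pooled two-proportion z test with a one-tailed normal probability; the function names `two_proportion_z` and `upper_tail_p` are assumptions introduced here, not the author's notation.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for the difference between two proportions.

    Assumes the pooled-variance form: the standard error is built
    from the combined proportion (x1 + x2) / (n1 + n2).
    """
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p1 - p2) / standard_error


def upper_tail_p(z):
    """One-tailed probability of a standard normal value at least z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Dissents: 31 of 123 permanent votes versus 1 of 19 temporary votes.
z_dissent = two_proportion_z(31, 123, 1, 19)
p_dissent = upper_tail_p(z_dissent)
```

Under this assumption the statistic for the dissent data comes out close to the reported value, and the same function applied to the en banc counts reported later (20 of 138 versus 0 of 19) comes out close to the value reported there.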
The proposition that permanent members view 
the presence of a temporary member as an op- 
portunity for workload reduction was tested by 
examining the majority opinion-writing assign- 
ments for each of the 370 decisions in the sample. 
The assignment of a particular judge to develop 
a written rationale for the court's ruling is the 
basic work unit beyond that required for all 
justices who participate in a decision. The pro- 
portion of times permanent and temporary judges 
were assigned the task of writing the opinion of 
the court was calculated. This proportion was 
based upon the number of instances a justice 
voted with the majority; for it is only under this 
circumstance that a judge is available to be 
assigned the responsibility for articulating the 
court's rationale. If our hypothesized relationships
are correct, temporary group members will
receive a disproportionately greater share of
opinion-writing assignments. The data support
this hypothesis. Permanent judges voted with
the majority 1,553 times and authored 226
majority opinions (14.6%). Temporary members,
however, sided with the majority in 369 instances,
for which they wrote 144 opinions of the court
(39.0%). This tendency to assign greater portions
of the workload to temporary newcomers is
statistically significant (z = 10.61, p < .001).
Finally, it was anticipated that the tendency
of permanent court members to use temporary
judges as a means of workload reduction would
be true only on routine matters. It was hypothesized
that on important legal questions newcomers
would not be allowed to play a substantial
role by writing majority opinions. It is a
well-known fact that appellate court justices relish
the opportunity to write landmark opinions. It
is not likely, therefore, that the court would permit
a temporary member to gain this valued task.
Furthermore, significant decisions are often
controversial, and the prestige of the opinion writer
can be a factor in obtaining compliance with
court rulings. This consideration also dictates that
the court’s spokesman in deciding significant
cases be a permanent justice.

The structure of the Washington court makes 
this hypothesis relatively easy to test. Because of 
its heavy caseload, the court normally divides 
itself into two five-judge departments, with the 
Chief Justice serving as a member of both. How- 
ever, when a majority of the court believes that 
a particular case is an important one, the full
nine-judge court decides the case. Full court
decisions are known as en banc rulings. These 
procedures allow an unbiased operational distinc- 
tion between routine and significant cases. Of the 
370 decisions in the sample, 20 were en banc 
rulings. An analysis of opinion-writing assign- 
ments supports the notion that the general ten- 
dency to assign the opinion of the court to the 
temporary judge breaks down when the court 
grapples with legal questions of major significance.
In the 20 en banc cases, permanent judges
voted with the majority 138 times and authored
all 20 opinions of the court. Temporary judges
voted with the court's majority 19 times but
did not receive a single opinion-writing
assignment (z = 1.78, p < .05).
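
When one of the observed counts is zero, the normal approximation behind a z-score is at its weakest, so an exact calculation can be a useful companion. The sketch below is not the author's analysis: it swaps in a hypergeometric probability and makes the simplifying assumption that the 20 opinions were distributed at random, without replacement, over the 157 majority votes. The function name `prob_no_temporary_author` is hypothetical.

```python
from math import comb


def prob_no_temporary_author(permanent_votes, temporary_votes, opinions):
    """Exact chance that every opinion goes to a permanent judge.

    Simplifying assumption: the opinions are assigned at random,
    without replacement, over all majority votes, so the count of
    opinions written by temporary judges is hypergeometric.
    """
    total_votes = permanent_votes + temporary_votes
    return comb(permanent_votes, opinions) / comb(total_votes, opinions)


# En banc cases: 138 permanent and 19 temporary majority votes, 20 opinions.
p_exact = prob_no_temporary_author(138, 19, 20)
```

The exact probability need not agree closely with the one-tailed normal probability implied by the reported z, which is one reason to treat a result resting on only 19 temporary participations with some caution.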


Discussion 


Based upon data gathered from behavior on 
the Washington State Supreme Court, two major 
propositions have been substantiated. First, a
temporary newcomer to a previously established 
group assumes a “chameleon” role. He displays a 
pronounced tendency to avoid conflict and votes 
consistently with the majority position within the 
group. Second, the permanent members view the 
introduction of a temporary member as an op- 
portunity to reduce the workloads of regular 
members. This workload reduction process extends
to normal, routine tasks; but when important
decision-making situations arise, the temporary
individual is shunned in favor of permanent
members.

While there are no readily applicable theories
to explain the behavior of these temporary group
members, some speculation may be helpful. The
major distinction between permanent and temporary
members of the Washington Supreme Court
is one of status difference. The permanent justices
are members of the state's highest legal tribunal,
with large amounts of prestige accruing from
selection to that bench. The temporary justices
belong to the same judicial fraternity but have
been placed on inferior state courts. Status
differences have previously been shown to affect
behavior in small experimental groups (Hurwitz,
Zander, & Hymovitch, 1968) and in collegial
courts (Walker, 1973). Justices who have
attained permanent status are very willing to
distribute routine tasks to temporary members.
Perhaps, due to substitute upward locomotion,
the lower status temporary member willingly
accepts these tasks and is reluctant to engage in
decision-making behavior that is not in accord
with the majority of his more highly placed
colleagues.


REFERENCES 
Becker, H. Through values to social interpretation.
Durham, N.C.: Duke University Press, 1950.
Freund, J. E. Modern elementary statistics. (3rd
ed.) Englewood Cliffs, N.J.: Prentice-Hall, 1967.
Hurwitz, J., Zander, A., & Hymovitch, B. Some
effects of power relations among group members.
In D. Cartwright & A. Zander (Eds.), Group
dynamics. New York: Harper & Row, 1968.
Mills, T. Group structure and the newcomer. Oslo,
Norway: Oslo University Press, 1957.
Snyder, E. The Supreme Court as a small group.
Social Forces, 1958, 36, 232-238.
Ulmer, S. S. Homeostasis in the Supreme Court. In
G. Schubert (Ed.), Judicial behavior. Chicago:
Rand McNally, 1964.
Walker, T. G. Behavioral tendencies in the three-judge
district court. Midwest Journal of Political
Science, 1973, 17, in press.
Ziller, R. C., & Behringer, R. D. Assimilation of
the knowledgeable newcomer under conditions of
group success and failure. Journal of Abnormal
and Social Psychology, 1960, 60, 288-291.
Ziller, R. C., Behringer, R. D., & Jansen, M. J.
The newcomer in open and closed groups. Journal
of Applied Psychology, 1961, 45, [pages illegible].


(Received May 3, 1972)


Journal of Applied Psychology
1973, Vol. 58, No. 1

The mail questionnaire has been a popular vehicle for field research. Studies
have dealt with return rates as a function of subject characteristics or stimulus
characteristics. The present study varied three stimulus properties in an
attempt to modify return rates. The characteristics were type of postage,
degree of personalization, and nonmonetary inducement. A chi-square analysis
for a 3 × 3 × 3 contingency table indicated that neither main effects nor
interactions had any effect on rate of return. In addition, latency of return was
unaffected by these variables. A replication of the study yielded identical
results.


The rate of response to mail questionnaires as 
a function of a variety of manipulations has been 
studied extensively (Champion & Sear, 1969; 
Donald, 1960; Gullahorn & Gullahorn, 1963; 
Linsky, 1965; Simon, 1967). The present study 
sought to consider the effects of type of postage, 
personalization of initial contact, and nonmonetary
inducement, as well as their interactions, on
the rate of return of questionnaires mailed to a 
random sample of the population at large. In 
addition, a replication was planned to verify any 
resulting relationships. 


METHOD 


Two samples were used to test the general
hypothesis that there is a relationship between the
three variables mentioned above and the rate of
return for a mail questionnaire. The first sample
was drawn randomly (Underwood, 1966) from
the pages of the Pittsburgh telephone directory.
While it is obvious that this is not exactly a
random sample of the population at large (e.g.,
those with unlisted numbers and those not having
phones would be missing), it was felt to be
sufficient for answering the questions posed.
Subjects were then randomly assigned to one of 27
cells for a total primary sample size of 810.² A
sample of the same size was identified
from the pages of the Philadelphia telephone
directory. Once again subjects were randomly
drawn and assigned to one of the 27 cells for
a total sample size of 810. This was the
replication sample.

² Although 30 subjects were initially assigned to
each cell, the analysis was corrected for undeliverable
letters.

The questionnaire was an attitude scale dealing 
with attitudes toward movies (Thurstone, 1930) 
and was specifically chosen for its noncontro- 
versial nature. It consisted of 40 items with 
which the subject was to agree or disagree. An 
inspection of the items indicated that, in spite of 
its age, the scale was perfectly appropriate for 
present-day use. 

There were three antecedent or independent
variables (type of postage, personalization, and
incentive), each having three levels: (a) type of
postage: Air Mail, first class (stamped), and
first class (metered); (b) personalization: dittoed
cover letter, mimeographed cover letter, and
individually typed cover letter; and (c) incentive:
inclusion of a short article dealing with
perceptions of movies (relevant condition),
inclusion of a short article dealing with scaling
problems in paired comparison scaling (irrelevant
condition), and no inclusion. With reference to
the first variable, type of postage, each of the
three levels was used on the outgoing
questionnaires only; all subjects received envelopes
addressed to the experimenters with a first-class
permit for return postage.

The 27 different combinations of the
three variables at their three different levels
constituted the cells to which the subjects were
randomly assigned.
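
The crossing of the three variables and the random assignment to cells can be illustrated with a short Python sketch. It is not the authors' procedure: the level labels paraphrase the text, and the function name `assign_to_cells`, the integer subject identifiers, and the seeded generator are assumptions introduced here.

```python
import itertools
import random

# Three levels for each of the three independent variables.
POSTAGE = ("air mail", "first class (stamped)", "first class (metered)")
PERSONALIZATION = ("dittoed", "mimeographed", "individually typed")
INCENTIVE = ("relevant article", "irrelevant article", "no enclosure")


def assign_to_cells(subject_ids, per_cell, rng):
    """Randomly assign subjects to the 27 cells, per_cell subjects each."""
    cells = list(itertools.product(POSTAGE, PERSONALIZATION, INCENTIVE))
    subjects = list(subject_ids)
    if len(subjects) != per_cell * len(cells):
        raise ValueError("number of subjects must equal cells x per_cell")
    rng.shuffle(subjects)
    return {
        cell: subjects[index * per_cell:(index + 1) * per_cell]
        for index, cell in enumerate(cells)
    }


# 810 hypothetical subject identifiers, 30 per cell.
assignment = assign_to_cells(range(810), 30, random.Random(1))
```

Shuffling the full list once and then slicing it guarantees equal cell sizes, which matches the 30 subjects initially assigned to each cell.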


The primary sample data collection was terminated
after 50 days. Replication data collection
was terminated after the same time interval had
elapsed.

⁴ An examination of the returns indicated [...]
than 10% of the respondents [...]


RESULTS AND DISCUSSION 


The data were analyzed by using chi-square
tests suitable for the consideration of contingency
tables of a 3 × 3 × 3 magnitude (Castellan,
1965). This enables one to consider the possible
interactions which may be systematically related
to the rate of return.
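
As a simplified illustration of one component of such an analysis, the sketch below tests a single main effect: whether return rate differs across the three levels of one variable. It is not the Castellan (1965) partitioning used in the study, and the counts are hypothetical values chosen only to give an overall return rate near 25%; the function names are likewise assumptions.

```python
import math


def chi_square_return_by_level(returned, mailed):
    """Pearson chi-square for a levels x (returned, not returned) table."""
    overall_rate = sum(returned) / sum(mailed)
    chi_square = 0.0
    for r, n in zip(returned, mailed):
        cells = ((r, n * overall_rate), (n - r, n * (1.0 - overall_rate)))
        for observed, expected in cells:
            chi_square += (observed - expected) ** 2 / expected
    degrees_of_freedom = len(returned) - 1
    return chi_square, degrees_of_freedom


def p_value_two_df(chi_square):
    """Upper-tail probability of chi-square with exactly 2 df."""
    return math.exp(-chi_square / 2.0)


# Hypothetical returns for the three postage levels, 270 mailed per level.
chi_square, df = chi_square_return_by_level([70, 66, 67], [270, 270, 270])
p_value = p_value_two_df(chi_square)
```

The closed form for the tail probability holds only for 2 degrees of freedom, which is what a three-level main effect on a returned/not-returned outcome gives; interactions would need a general chi-square distribution function.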

The overall rate of return for the primary and
replication samples was 25% and 17%, respectively.
This is slightly lower than the usual rate
of return for unsolicited mail questionnaires in
the population at large. The depressed rate may
be due to the relatively neutral nature of the
questionnaire.

The results in terms of the treatment variables
are rather unequivocal. Of the seven possible
chi-square tests of main effects, two-way interactions,
and the three-way interaction for the primary
sample, none were significant at the .05
level. Exactly the same results were found in the
analysis of the replication sample.

A post hoc test of the proposition that the
treatment variables affected the latency of return
was made. The two samples were trichotomized
on the basis of how long a period of time
elapsed before the questionnaire was returned.
Once again, all chi-square tests were nonsignificant.
There was no relationship between any of
the treatment variables and the speed with which
the form was returned. The same was true for
the analysis of the replication sample. (All possible
combinations of the three variables at their
three different levels were tested.)

[...] results are obvious. However intuitively
appealing the notion [...], the rate of return and
the speed with which the questionnaire [...] are in
no way dependent on these factors when treated
either as main effects or in combination.
Hence, the suggestion we would give to anyone
contemplating a questionnaire study resembling
the situation we described, in terms of sampling
and type of questionnaire, would be to send the
forms out with metered-mail postage, a dittoed
cover letter, and no additional enclosure about
the topic area. It’s cheaper, faster, and works
just as well!

There are, of course, some qualifications which 
must be made. The nature of the questionnaire 
may possibly act, as either a main effect or in 
combination with the other variables considered 
in this study, to affect rate and/or speed of 
return. 

In addition, if one uses the notion of incentive 
in its strictest sense and holds the reward until 
after the appropriate response has occurred, the 
incentive factor might account for a greater por- 
tion of the variance in rate of return. 

Finally, if one is concerned with a stratified
sample of some sort rather than one drawn from
the population at large, the results might be
quite different.

It is hoped that the results presented in this
study might lead researchers to question those
generally accepted, intuitively appealing assumptions
which affect, in a very practical sense, the
way research is carried out.
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EVALUATION OF AN ABILITIES CLASSIFICATION SYSTEM 
FOR INTEGRATING AND GENERALIZING HUMAN 
PERFORMANCE RESEARCH FINDINGS: 
AN APPLICATION TO VIGILANCE TASKS¹

The tasks used in 53 studies in the vigilance literature were classified in terms
of the abilities required for task performance. For studies falling within each
category of task, mean performance computed across studies was plotted as a
function of time in the vigil. The curves relating time in the vigil to detection
accuracy were found to differ as a function of the ability requirements of the
tasks. Similarly, when the effects of selected independent variables (e.g., signal
rate, sensory mode, and knowledge of results) on performance were examined,
different functional relations were found depending on the abilities required
by the tasks. Thus, classification of these experimental tasks by an abilities
taxonomy improved generalizations about the effects of independent variables
on vigilance performance; relations were revealed which had been obscured
without these task classifications.


There is a continuing need to make more
effective use of behavioral data generated by
human performance research. This need is
intensified as more research is conducted and
the available body of human performance
literature grows. In particular, better ways
are needed to generalize research findings
from laboratory studies to operational settings,
from one experimental study to another,
and from one operational situation to another.
There are serious limitations in the extent to
which we can do this in the human performance
area. As a result, it is difficult for those
in operational settings to make predictions
about factors affecting human performance
from knowledge of the performance research
literature. Similarly, it is difficult for researchers
to develop general principles about factors
affecting human performance which can serve
as a basis for further theoretical and scientific
development. The essential problem is the
need to generalize the effect of some training,
environmental, or procedural condition, from
knowledge of its effect on one task, to its
probable effect on some other task. What has
been lacking, it is felt, is a system for classifying
such tasks which would lead to improved
generalization and prediction (cf.
Fleishman, 1967b).

The assumption underlying the work in the 
present study is that a system for classifying 
tasks can be developed which would allow 
more dependable prediction of the effects of 
independent variables on task performance 
within and between classes of tasks. Such a 
system would be especially valuable in mak- 
ing most effective use of available data and 
for predicting performance on new tasks. The
development of such a taxonomic framework,
employing a language for describing tasks
common to many different basic and applied
areas, should improve communications among 
researchers and applied personnel and help 
organize human performance information. An 
additional benefit deriving from such a 
taxonomy is the identification of gaps in exist- 
ing knowledge, whereby future research can 
be directed on how given factors affect 
performance on particular kinds of tasks. 

The focus of this article is on the preliminary
evaluation of a particular taxonomic
system in terms of its capacity to organize a
portion of the data found in the human
performance literature. The objective was to
examine the feasibility of structuring an area
of literature according to the “abilities”
required for task performance. The rationale
and potential for classifying tasks in terms
of ability requirements have been described
elsewhere (Fleishman, 1967a, 1972). Abilities
are general traits of the individual which pro- 
vide him with the capacity to perform differ- 
ent tasks. Such abilities are inferred primarily 
from reported factor analyses of human 
performance in cognitive, perceptual, and 
motor tasks. A task classification system 
based on human abilities has undergone con- 
siderable developmental research and eval- 
uation. Of particular importance for the 
present effort have been earlier efforts to 
develop reliable, anchored scales for observers 
to use in estimating ability requirements from 
task descriptions (Theologus & Fleishman, 
1971; Theologus, Romashko, & Fleishman, 
1970). 
The area of human performance selected
for examination was that of sustained attention
or vigilance behavior. Vigilance is generally
considered to involve a change in the
detection of infrequent signals over prolonged
periods of time. The signals are usually
temporally and spatially random in character
(Buckner & McGrath, 1963). The characteristic
finding in vigilance research is the deterioration
of performance with time in the
task.

The vigilance area was selected for study
for several reasons. First, while a variety of
different tasks have been used in previous
vigilance research, the range of tasks is not
as great as in many other areas. C[...]
the number of differ[...] not likely [...]
reduce th[...]. Secondly, a large number of
studies have been reported on the effects of
independent variables on vigilance performance,
and it is important that [...] be available to
provide a [...] of data points within task
categories. Thirdly, a common dependent
measure is employed in nearly all vigilance
studies, that is, detection accuracy. It is
essential to the integration [...] studies that
a comm[...].

The study was d[...]
variables in vigilance performance were possible
as a result of such task classifications.

Specifically, the present study (a) examines
and categorizes the tasks utilized in the
vigilance literature in terms of the abilities
required by these tasks, (b) plots performance
as a function of time in vigil for
studies falling within each category of task,
and (c) evaluates the effects of three independent
variables (signal rate, sensory-input
mode, and knowledge of results) on vigilance
performance for different task categories. The
three independent variables selected have
been frequently manipulated across studies
of vigilance and have demonstrated generally
consistent effects. Performance is generally
enhanced by increased frequencies of signal
presentation, by multiple sensory-input
modes, and by providing knowledge of
results (Davies & Tune, 1969).


METHOD 
Selection of Research Studies 


A literature search of the past 12 years was 
carried out and 195 articles were identified. Criteria 
for study acceptance were developed and applied. 
These criteria included adequacy of task description, 
manipulation of independent variables of interest, 
use of the performance measure “accuracy of detec- 
tion” or a similar measure, and the presentation of 
Performance data over time. Quality filtering of the 
studies according to these criteria yielded 60 accept- 
able studies. Of the 135 studies eliminated, 20% 
were rejected because they used performance 
measures that could not be transformed into detection
accuracy, 19% were rejected because they failed
to present performance data over time, and 61% 
were rejected either because none of the preselected 
independent variables were manipulated, the task 
description was not presented in sufficient detail, or
the experimental procedures were inappropriate or
inadequate.
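
The selection figures above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the article reports percentages of the eliminated studies, not counts, so the per-reason counts are rounded estimates and the short reason labels are paraphrases introduced here.

```python
# Worked arithmetic for the study-selection figures quoted in the text.
identified = 195   # articles found in the literature search
accepted = 60      # studies passing the first quality filter
eliminated = identified - accepted

# Rejection shares as reported (percent of eliminated studies).
# The labels are paraphrases, not wording from the article.
shares = {
    "measure not convertible to detection accuracy": 20,
    "no performance data over time": 19,
    "other criteria not met": 61,
}

# Approximate counts: the article gives only percentages, so these are estimates.
approx_counts = {
    reason: round(eliminated * pct / 100) for reason, pct in shares.items()
}

print(eliminated)             # number of studies eliminated
print(sum(shares.values()))   # the three shares should cover all eliminations
print(approx_counts)
```

The three shares sum to 100%, and the rounded counts add back up to the number of eliminated studies, so the reported figures are internally consistent.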

A final review of the resultant data set was then
conducted. The purposes served by this review were
(a) to verify that each study met all of the criteria
established and (b) to identify and eliminate any of
the studies containing data anomalies. The set of acceptable
studies was further reduced to 53. For each
study, the description of the experimental task
involved was examined to estimate the primary
ability required for task performance.


Selection of Abilities

Since vigilance entails the detection of infrequent,
randomly appearing signals over a prolonged time
period, the ability domains considered were perceptual
and cognitive. The nature of vigilance
performance precludes the involvement of physical 
abilities and minimizes the importance of psycho- 
motor abilities. Within each of the two ability 
domains, two abilities were selected as best 
representing aspects of vigilance performance. 

The two abilities in the perceptual domain and 
their respective definitions were as follows: 

1. Perceptual speed. The speed with which sensory 
patterns or configurations are compared in order to 
determine identity or degree of similarity. Com- 
parisons may be made either between successively 
or simultaneously presented patterns or configur- 
ations or between remembered or standard configu- 
rations and presented configurations. The sensory 
patterns to be compared occur within the same sense 
and not between senses. 

2. Flexibility of closure. The ability to identify or 
detect a previously specified stimulus configuration 
that is embedded in a more complex sensory field. 
It is the ability to isolate the specified relevant 
stimulus from a field in which distracting stimulation 
is intentionally included as part of the task to be 
performed. Only one information source is utilized.
This ability applies to all senses with the restric- 
tion that both the relevant and distracting stimu- 
lation occur within the same sense modality. 

The abilities and their definitions in the cognitive 
domain were as follows: 

1. Selective attention. The ability to perform a 
task in the presence of distracting stimulation or 
under monotonous conditions without significant loss 
in efficiency. When distracting stimulation is present 
in the task situation, it is not an integral part of 
the task being performed but rather is extraneous 
to the task and imposed upon it. The task and the
irrelevant stimulation can occur either within the
same sense or across senses. Under conditions of
distracting stimulation, the ability involves concentration
on the task being performed and filtering
out of distracting stimulation. When the task is
being performed under monotonous conditions, only
concentration on the task being performed is
involved.

2. Time sharing. The ability to utilize information 
obtained by shifting between two or more channels 
of information. The information obtained from these 
sources is either integrated and used as a whole or 
retained and used separately. 

Tasks in the vigilance literature requiring per- 
ceptual speed were exemplified by Eason, Beardshall, 
and Jeffee (1965) and Baddeley and Colquhoun 
(1969). The earlier study required subjects to press
a switch when they detected a flash of light which
appeared in a 1-inch circular hole and lasted for .8
second. The light was normally on for .5 second
and the interval between flashes was 3 seconds.
Baddeley and Colquhoun had subjects inspect a
series of 1,200 displays, each comprised of six discs,
and press a response key under any disc which was
17% larger than the other five discs. Displays were
presented at 2-second intervals for a 1.8-second
duration. Examples of tasks in which the predominant
ability was flexibility of closure included Baker
(1963), who required his subjects to press a microswitch
each time there was a brief stoppage (from
.2 to .8 second) of the second hand of a clock
whose face was 20 centimeters in diameter, and
Adams (1965), who had his subjects flip a toggle
switch whenever they detected a 2-millimeter blip
of light appearing in the center of a 5-inch white
screen for 1 or 2 seconds.


Procedure 


The application of the ability classification system 
to a body of literature involves the determination 
of the extent to which an ability is required for task 
performance. Adaptations of the ability rating scales 
developed by Theologus et al. (1970) were used to 
estimate the ability requirements of each task. The 
scale contains a detailed definition of the ability and 
its distinctions from other abilities. It is a 7-point 
scale with anchor points represented by empirically 
scaled task examples (see Theologus et al., 1970). 
Reliabilities of the scales and preliminary validation 
against empirically derived factor loadings of tasks 
rated have been presented earlier (Theologus & 
Fleishman, 1971). In rating a particular task, two 
basic questions must be answered: (a) Is the ability 
required for performance? (b) If so, to what extent? 
The ability ratings are then ranked in order to 
identify the most important ability for task per- 
formance. An arbitrary but rational criterion was 
established for the classification of a study into an 
ability category. The ability ranked highest in im- 
portance (the dominant ability) had to be rated at 
least 5 (between moderate and high on the scale) or 
have a value 2 scale points higher than any other 
ability required for task performance for the study's 
inclusion into that ability category. This criterion 
for selecting studies made it highly likely that the 
dominant ability was essential to task performance. 
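
The classification criterion described above can be written as a small decision rule. The following is a minimal sketch, not the authors' procedure: the function name, the dictionary input format, and the handling of ties and of tasks with only one rated ability are assumptions introduced here, since the text does not cover those cases.

```python
from typing import Dict, Optional

MIN_RATING = 5   # "at least 5" on the 7-point scale
MIN_MARGIN = 2   # "2 scale points higher than any other ability"


def dominant_ability(ratings: Dict[str, float]) -> Optional[str]:
    """Return the ability category for a task, or None if no ability qualifies.

    `ratings` maps an ability name to its 7-point scale rating
    (a hypothetical input format chosen for this illustration).
    """
    if not ratings:
        return None
    # Rank abilities from highest to lowest rating.
    ranked = sorted(ratings.items(), key=lambda item: item[1], reverse=True)
    top_name, top_rating = ranked[0]
    if len(ranked) > 1:
        runner_up = ranked[1][1]
        if runner_up == top_rating:
            return None  # assumption: a tie at the top leaves the task unclassified
        margin = top_rating - runner_up
    else:
        margin = float("inf")  # assumption: no competing ability was rated
    # Either condition is sufficient, following the "or" in the stated criterion.
    if top_rating >= MIN_RATING or margin >= MIN_MARGIN:
        return top_name
    return None


print(dominant_ability({"perceptual speed": 6, "selective attention": 5}))
print(dominant_ability({"flexibility of closure": 4, "time sharing": 2}))
print(dominant_ability({"perceptual speed": 4, "selective attention": 3}))
```

The first task qualifies on the rating threshold, the second on the 2-point margin, and the third on neither, so it would be left out of every category.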

Studies were divided into 4 ability categories based
on the predominant ability required for task performance.
This produced 25 experiments involving perceptual
speed, 25 involving flexibility of closure, 6
involving selective attention, and 2 involving time
sharing. There was a larger total number of experiments
than studies, since several studies reported
multiple experiments. In the analyses to be discussed,
the results for the selective attention and time sharing
ability categories must be viewed with caution and
only as preliminary suggestions, since the number of
experiments in these categories was quite small.
Once the studies were classified, the performance
data for each study within each category were
examined. Specifically, median percent detections at
each 10-minute interval up to 3 hours through the
vigil were computed and plotted. It was then possible
to evaluate the relation between detection
accuracy and time in the vigil for experiments
involving different classes of tasks.
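
The pooling step described here, taking the median percent detections across studies at each 10-minute interval, can be sketched as follows. The data values are invented purely for illustration, and the function name and input layout are assumptions; the article does not describe how studies with missing intervals were handled, so this sketch simply uses whichever studies report a given interval.

```python
from statistics import median


def median_by_interval(studies, interval=10, horizon=180):
    """Median percent detections across studies at each time point.

    `studies` is a list of dicts mapping minutes into the vigil to
    percent detections (hypothetical layout). Studies that do not
    report a given interval are skipped for that interval.
    """
    curve = {}
    for t in range(interval, horizon + 1, interval):
        values = [study[t] for study in studies if t in study]
        if values:
            curve[t] = median(values)
    return curve


# Invented example data: three studies within one task category.
studies = [
    {10: 80, 20: 75, 30: 70},
    {10: 85, 20: 70, 30: 60, 40: 65},
    {10: 78, 20: 72},
]
curve = median_by_interval(studies, horizon=40)
print(curve)
```

Because later intervals are covered by fewer studies, the medians toward the end of the vigil rest on fewer data points, which is one reason such curves call for cautious interpretation.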

Median detection accuracy was also computed for
each study in each task category across the preselected
levels of the three independent variables of
interest. These data, plotted in 10-minute intervals,
permitted an evaluation of the effects of signal rate,
sensory mode, and knowledge of results upon performance
in the vigil as a function of different task
categories.

Fig. 1. Median percent correct detections as a
function of time in the task for studies in which the
predominant ability is perceptual speed.

A supplementary analysis involved the classification
of tasks in terms of both a predominant ability and
a secondary ability,
Median percent correct detections at each
10-minute interval through the first 90 or 180 


Curves were fitted by eye to 
oing into each
—the greater the number of 
Classification in Terms of Abilities


Median percent correct detections through
the first 90 minutes of the vigil were computed
for all studies categorized according to
one of the four predominant abilities required
for task performance. Figures 1 and 2 show
these median points along with the range of
values dispersed around the medians for tasks
classified as perceptual speed and flexibility
of closure, respectively. The figures present
percent correct detections as a function of
time in the vigil for the two ability categories.
For tasks in which the predominant ability
was perceptual speed, Figure 1 suggests that
the performance decrement occurred primarily
within the first hour of the vigil and
that subsequent time in the task led to no
further performance deterioration. Figure 2,
which deals with tasks involving the predominant
ability of flexibility of closure, suggests
that after the initial hour of performance
degradation (which is similar to that
shown in Figure 1), performance was
enhanced with time in the vigil. Furthermore,
the range of values about the median points
demonstrates that tasks requiring flexibility
of closure resulted in less performance variability
than did tasks involving perceptual
speed. In both Figures 1 and 2, initial detection
accuracy was approximately 80%, and
it deteriorated to about 65% by the end of
the first hour. Beyond 60 minutes, performance
on perceptual speed tasks remained at
about 65% accuracy, while performance for
tasks requiring flexibility of closure increased
to a level approximating initial performance.

There were too few data to permit interpretation
for the abilities of selective attention
and time sharing. Overall, the trend of
these data was similar to that found for
flexibility of closure tasks.

The data suggest that detection accuracy
in a vigilance task deteriorates up to a critical
point in time, then is enhanced when vigilance
tasks require the abilities of flexibility of
closure, selective attention, or time sharing.
However, when tasks require perceptual speed
as the predominant ability, the performance
decrement levels off and does not reverse (at
least for the first 90 minutes of the vigil).


Classification by Abilities and Independent Variables

The data contained in Figures 1 and 2 were
next partitioned according to levels of each
of the three selected independent variables.

Signal rate. Figure 3 depicts percent correct
detection as a function of time in the vigil,
with signal rate as a parameter for tasks
requiring the predominant ability of either
perceptual speed or flexibility of closure. At
all three levels of signal rate, differences exist
in the functional relationships between time
in the vigil and performance for the two
ability categories.

For low signal rates (<1/minute), performance
on perceptual speed tasks decreased
linearly with time in the task, while performance
on flexibility of closure tasks demonstrated
a sharp decrement early in the vigil
and leveled off after the first hour. In addition,
average performance on flexibility of
closure tasks was lower than that on perceptual
speed tasks.

Fig. 3. Median percent correct detections as a function
of time in the task and ability category
for each signal rate.


For moderate signal rates, studies requiring 
perceptual speed for successful task perfor- 
mance demonstrated a gradual performance 
decrement with time in the task, while studies 
in which flexibility of closure was required 
showed a small degree of performance enhancement
with time in the task, at least up
to the first 90 minutes. Overall, performance
was more accurate for both the flexibility of 
closure and the perceptual speed categories at 
moderate signal rates than at low signal rates. 

At high signal rates (>2/minute), performance
dropped very rapidly for both perceptual
speed and flexibility of closure tasks.
For perceptual speed tasks, however, perfor- 
mance leveled off with time in the task. 

Sensory mode. Figure 4 depicts percent cor- 
rect detections at 30-minute intervals 
throughout the first 180 minutes of the vigil 
with sensory mode as the parameter. This 
independent variable was trichotomized into 
auditory, visual, and auditory-visual redun- 
dant categories. Regardless of whether the 
ability category was perceptual speed or
flexibility of closure, overall performance was 
superior under auditory conditions rather 
than visual conditions. Furthermore, the 
redundant condition was markedly superior 
to either auditory or visual presentation 
when the main ability required for task performance
was flexibility of closure. There
were insufficient data to generate a function 
in the redundant condition for perceptual 
speed tasks. 

The auditory condition revealed a marked
differentiation between the relationships
describing performance as a function of time
in the task for studies involving perceptual
speed and those involving flexibility of closure.
For perceptual speed tasks, a severe
performance decrement was obtained with
time in the task, up to 90 minutes. Alternatively,
there was a very small performance
decrement for flexibility of closure tasks
within the first 90 minutes and an increment
in performance accuracy beyond that time.

For the visual condition, the function
describing performance with time in the task
for perceptual speed studies was similar to
that obtained for the auditory condition.
However, the flexibility of closure function in
the visual condition was almost the reverse of
that for the auditory condition. That is, for
studies in which flexibility of closure was the
predominant ability, performance was constant
during the first 90 minutes of the task,
then a marked deterioration began to occur.

Knowledge of results. Figure 5 shows
medians computed across studies falling into
either perceptual speed or flexibility of closure
categories for knowledge of results and no
knowledge of results conditions. Overall, the
conclusion that knowledge of results is
superior to no knowledge of results was
supported. The number of data points available
for establishing the functional relationship
between percent detection and time in
the task for perceptual speed tasks was
extremely small and, therefore, did not
permit interpretation. Functions were drawn
for tasks involving flexibility of closure.
When knowledge of results was provided,
there was a very small initial decrement in
performance followed by a leveling off and
subsequent improvement in performance
accuracy. In the no knowledge of results
condition, on the other hand, the performance
decrement was moderate and consistent
through the first 90 minutes of the vigil,
after which no further decrement occurred.
Summary. While the data of Figures 3-5
are preliminary and in several instances
functions are based upon only a few data
points, it is nevertheless possible to infer that
the effects of the independent variables on
performance in a vigilance task are in part a
function of the class of task imposed upon
the subjects. It has been demonstrated that
when studies are categorized by abilities
required by the task, relationships between
performance and time in the task differ
markedly as a function of the selected
independent variables.

Fig. 5. Median percent correct detections as a function
of time in the task and ability category for each
condition of knowledge of results.


Classification by Multiple Abilities


Figure 6 depicts three functions, each 
describing detection accuracy with time in the 
vigil for studies classified in terms of both a 
predominant and a secondary ability. A sec- 
ondary ability was defined as one which was 
rated second highest relative to the predom- 
inant ability. Two of the functions relating 
performance to time in the task denote the 
predominant ability of perceptual speed and 
a secondary ability of either selective atten- 
tion or time sharing. 

These functions, based upon medians for 
all studies falling into these two classifica- 
tions, are different. When time sharing was 
the second most important ability, the rate 
of performance deterioration over time was 
markedly greater than it was when selective 
attention was the second most important 
ability. In addition, for the perceptual-speed-time-sharing
combination, performance linearly
decreased as a function of time in the
task, while for the perceptual-speed-selective-attention
grouping, the function describing
performance with time in the task leveled off
at approximately 90 minutes into the vigil.

The third function in Figure 6 shows the 
relationship between performance and time in 
the vigil when flexibility of closure was the 
predominant ability and selective attention 
was the second most important ability. This 
function indicates that performance deteri- 
orated up to the first hour in the task, then 
improved with additional time in the task. 
This function should be compared to the one 
in which selective attention was also the 
second most important ability, but perceptual 
speed was the predominant ability. In this 
function, performance leveled off after 90 
minutes.


SUMMARY AND CONCLUSIONS 


This study was conducted to provide a 
preliminary evaluation of an 


gories of ability requirements. However,
abilities did result in markedly different
performance functions over time. These functions
can be viewed as hypotheses which may
be tested in future experiments.

Performance, measured in terms of percent 
of signals correctly detected, typically de- 
creases as a function of time in the vigil. 
Although this finding has been repeatedly 
demonstrated in the vigilance literature, 
previous research had not indicated whether 
or not the nature of this function differed for 
different categories of tasks. By classifying 
tasks according to one of four primary abil- 
ities required for task performance, differ- 
ential relationships between performance and 
time in the vigil were obtained. The most 
notable difference among the functions was 
that performance deteriorated up to a certain 
point in time, then became enhanced when 
vigilance tasks required the abilities of 
flexibility of closure, selective attention, or 
time sharing; but for tasks which required 
perceptual speed, the performance decrement 
did not reverse. 

In the present study, when task performance
was partitioned by levels of three
independent variables (signal rate, sensory
mode, and knowledge of results), marked
differences in the functional relationships
emerged for the two primary ability categories
of perceptual speed and flexibility of closure.
The impact of an independent variable on
performance was a function of the abilities
required by the task.

Tasks were classified not only by the 
primary ability required for performance, but 
were also classified jointly in terms of a 
primary and a secondary ability. Functional 
relationships developed according to primary 
ability categories were somewhat modified by 
consideration of a secondary ability in con- 
junction with the primary one. This finding 
implies that the consideration of multiple 
abilities required for performance of a task 
might alter the functional relationships 
developed strictly on the basis of a single 
predominant ability. 

It should be emphasized that despite the
differences among specific tasks in terms of
equipment, displays, response requirements,
etc., the classification system enabled an
integration of results and the development of 
functional relationships that were otherwise 


obscured. 
EFFECTS OF SLEEP LOSS AND STRESS
UPON RADAR WATCHING *


BENGT BERGSTRÖM, MATS GILLBERG, AND PETER ARNBERG


Institute of Military Psychology, Stockholm, Sweden 


Detection performance in a 40-minute radar-watching task was studied using 
20 soldiers, divided into two matched groups. The experimental group was 
deprived of sleep for 78 hours, and both groups were then subjected to stress 
induced by unpleasant electric shocks. It was hypothesized that the effects of
sleep loss and stress oppose each other through de-arousing and overarousing, 
respectively. Results indicated significant impairment of performance when sub- 
jects were deprived of sleep, but they indicated an improvement under stress. 
Changes were accompanied by small but reliable heart rate reduction and 
elevation, respectively, thus lending support to the hypothesis. 


There is some evidence that the impaired 
performance of a sleep-deprived subject 
improves if a secondary stressor, for example, 
noise, is introduced (Corcoran, 1962). This 
finding could be explained in terms of arousal 
theory, namely, if sleep loss acts by de-arous- 
ing and short-term stress by overarousing. 
Since this theory holds that performance is
related to arousal level according to the inverted
U function (Malmo, 1959), the two
stressors should tend to cancel out and
performance should improve.

In an experiment by one of the present
authors (Bergström, 1972), the combined
effects of sleep loss and short-term, threat-induced
stress were studied. The subjects
performed a missile-type tracking task in 20-second
trials. The pulse rate data suggested
that the stress and the sleep loss opposed each
other by overarousing and de-arousing,
respectively. Since tracking performance,


unaffected by the sleep


mance,
result of lapses or microsleeps, which are short
periods during which the subject is asleep
without being aware of it (Cannon, Drucker,
& Kessler, 1964; Williams, Lubin, & Goodnow,
1959). Microsleep can be measured indirectly
by noting failures to respond or
directly by an electroencephalogram (Jovanović,
1971). The two major effects of the
lapses are slowed speed in a subject-paced
task and errors of omission in an experimenter-paced
task. In the subject-paced task,
the response can be deferred until the lapse is
over, but in the experimenter-paced task,
where the stimulus is present for a short time
only, a lapse coinciding with the stimulus will
produce an error of omission. Vigilance tasks
are typically experimenter paced (Wilkinson,
1960). It seems likely, then, that the occurrence
of microsleep did not affect the above-mentioned
tracking task, first, because of its
short duration and, second, because there was
a 30-second interval between trials (the task
was essentially experimenter paced, but between
each trial the subject was given knowledge
of results and told to get ready for the
next trial).

The primary purpose of the present experiment
was to study the combined effects of 78
hours of sleep loss and threat-induced stress
upon vigilance, namely, 40-minute radar
monitoring. It was hypothesized that detection
performance degrades under sleep loss
and improves during short-term stress. The
performance changes are accompanied by
lowered and heightened arousal, respectively.
METHOD
Subjects 


Approximately 100 draftees of a Swedish Army 
company volunteered as subjects. To minimize the 
risk that some subjects would give up the experiment, 
30 of the most reliable men were selected using offi- 
cers’ ratings and interviews as a basis. All of them 
scored average or better on the induction test (gen- 
eral mental ability). The group of 30 was then given 
preliminary tests on the radar watching task. The 
results of these tests as well as the individual pulse 
rates were used to form two matched groups with 
10 subjects in each. 


Design and Procedure 


The study was designed so that the experimental 
subjects’ (experimental group) performance could be 
evaluated against their own results on the first day 
and against the performance of the control subjects 
(control group) who did the same tasks but slept 
as usual. The experimental procedure for the exper- 
imental group is shown in Table 1. The control 
subjects went through the experiment two weeks 
before the experimental subjects. The former were 
tested from 9 a.m. to 3 p.m. on five consecutive 
days (stress was introduced on the fifth day). Mea- 
sures were taken so that the control group could not 
communicate their experiences of the experiment to 
the experimental group. 

The experimental group was deprived of sleep
from 6 a.m. on the first day to 3 p.m. on the fourth
day, that is, on the average 78 hours of sleep loss.

It was necessary to inform the subjects in advance
that the experiment would involve sleep deprivation
and that they would receive unpleasant electric
shocks. Since the subjects knew when the experiment
period was to end, they were aware that four nights
represented the maximum deprivation period.

The sleep-deprived subjects were watched continuously
by one of the experimenters and two officers,
and they kept alert by playing various games,
by taking short walks out-of-doors, or by having
discussions. In neither group were hard physical
exercises employed. Coffee was not allowed, and tea
was not served within the two hours preceding any
test session. Smoking was allowed between sessions.

The subjects of the experimental group were paid
150 Swedish kronor (SKr) and those of the control
group 50 SKr for finishing the experiment. In addition,
all subjects received 2 SKr per detected target,
but in case of a false alarm 2 SKr were deducted
from the pay. An experimental subject could earn
230 SKr at the most.
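
The stated maximum of 230 SKr follows from the payment rules once the number of targets is known. The sketch below is an illustrative check, assuming that the per-target payment applied to all five sessions and using the eight targets per session reported in the Apparatus section; the function and constant names are introduced here and do not come from the article.

```python
BASE_PAY_EXPERIMENTAL_SKR = 150  # flat payment, experimental group
BASE_PAY_CONTROL_SKR = 50        # flat payment, control group
PER_TARGET_SKR = 2               # bonus per detection, penalty per false alarm
TARGETS_PER_SESSION = 8          # from the Apparatus section
SESSIONS = 5                     # assumption: the bonus covered all five sessions


def total_pay(detections, false_alarms, base=BASE_PAY_EXPERIMENTAL_SKR):
    """Total pay in SKr for a subject who finished the experiment."""
    return base + PER_TARGET_SKR * (detections - false_alarms)


# Best case: every target detected and no false alarms.
max_detections = TARGETS_PER_SESSION * SESSIONS
max_pay = total_pay(max_detections, 0)
print(max_pay)
```

Under these assumptions the best case reproduces the 230 SKr ceiling quoted for an experimental subject, which suggests the bonus was indeed counted over all 40 targets.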
TABLE 1

EXPERIMENTAL PROCEDURE FOR THE EXPERIMENTAL GROUP

Day | Session | Time of testing | Hours without sleep
1   | 1       | 9 a.m.-3 p.m.   | 6
2   | 2       | 9 a.m.-3 p.m.   | 30
3   | 3       | 9 a.m.-3 p.m.   | 54
3-4 | 4       | 9 p.m.-3 a.m.   | 66
4   | 5       | 9 a.m.-3 p.m.   | 78 + stress
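
The "hours without sleep" column of Table 1 can be reproduced if each value is read as the time awake at the midpoint of the 6-hour testing window, counted from 6 a.m. on the first day. That reading is an assumption made here (the text says only "on the average 78 hours"), and the calendar date used below is arbitrary.

```python
from datetime import datetime, timedelta

# Start of sleep deprivation: 6 a.m. on day 1 (the calendar date is arbitrary).
WAKE = datetime(2000, 1, 1, 6, 0)

# Session number -> (day on which the session starts, starting hour on a 24-hour clock).
SESSION_STARTS = {1: (1, 9), 2: (2, 9), 3: (3, 9), 4: (3, 21), 5: (4, 9)}


def hours_awake_at_midpoint(day, start_hour, length_hours=6):
    """Hours since WAKE at the midpoint of a testing window."""
    start = datetime(2000, 1, day, start_hour, 0)
    midpoint = start + timedelta(hours=length_hours / 2)
    return (midpoint - WAKE).total_seconds() / 3600


hours = {
    session: hours_awake_at_midpoint(day, hour)
    for session, (day, hour) in SESSION_STARTS.items()
}
print(hours)
```

The computed midpoints agree with the table, including the night session that straddles days 3 and 4.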


brought in. A new experimenter, who adopted a strict 
and almost unfriendly manner, was introduced. He 
read the following instruction: 


We are now going to measure your performance
at the same time as you receive strong electric 
shocks. You may get one or several shocks at any 
time during the session. The shocks are designed 
to give intensive pain. Yet try to report the targets
as quickly as you can. If you have any questions,
ask them now. You must then keep silent
throughout the session. 


Concentric electrodes were strapped to the subject’s 
right leg, just below the knee. The electric stimulation 
was a 70-cycle-per-second alternating current (square 
wave) with 1.4-millisecond pulse length and of 1.5- 
seconds duration. The intensity of the shocks was 
adjusted to seven times the individual sensation 
threshold value, which was measured immediately 
prior to the stress session. Shocks were delivered at 
irregular intervals eight times during the sessions (no 
shocks were given less than one minute before the 
appearance of a target). 


Apparatus 


Simulated radar tracks were generated on the
cathode ray tube display of the Digital Equipment
PDP-12 computer. To achieve some resemblance
to real radar tracks, the blips were displayed with
decreasing intensity. The picture was virtually noise
free and no sweep was shown. Five equivalent sessions
were prepared and recorded on videotape.
Hence, the two groups were given identical sessions,
but any one subject did not see a particular tape
more than once.

The subject sat in front of a 9-inch television
monitor with a viewing distance of 1 meter. A radar
target consisted of five blips, 10 millimeters long.
The time interval between each blip was 1 second,
and after the fifth blip the target remained visible
for another second. Eight targets appeared irregularly
in each 40-minute session. The subject’s task was to
report the presence of a target by pressing a button
as fast as possible. The heart rate was recorded every
second minute by the San-Ei Type PM 101 pulsemeter
with earlobe sensor.
RESULTS
Detection Performance


Figure 1 shows the average detection probability
by the first, second, third, fourth, and
fifth blip with sleep loss as the parameter. The
dotted line represents Session 5, that is, 78
hours of sleep loss and stress. Figure 2 depicts
the average detection probability for each of
four consecutive periods during the five
sessions.
There were no significant differences between
the groups in the first session (6 hours),
and almost 100% of the targets were detected.
During the following sessions all targets but
one were detected in the control group. In the
experimental group, however, detection performance
deteriorated markedly. After 30
hours of sleep loss the average detection percentage
was about 90, and after 54 and 66
hours the figures were 80 and 70, respectively.
The differences between the first four sessions
were significant (all significances reported are
p < .01, sign test).
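
To show what a sign test at p < .01 requires with groups of this size, the following sketch computes an exact two-sided sign test from the binomial distribution. It is an illustration only: it does not use the authors' data, and the article does not say whether the tests were one- or two-sided or how tied differences were treated (here ties are assumed to have been dropped beforehand).

```python
from math import comb


def sign_test_p(n_positive, n_negative):
    """Exact two-sided sign test p value.

    Counts are the numbers of subjects whose difference was positive
    or negative; tied (zero) differences are assumed to be excluded.
    """
    n = n_positive + n_negative
    k = min(n_positive, n_negative)
    # Probability of a split at least this lopsided in one direction
    # under the null hypothesis of equally likely signs.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical splits for a group of 10 subjects.
print(sign_test_p(10, 0))
print(sign_test_p(9, 1))
```

With 10 subjects, only a unanimous split falls below .01; a 9-to-1 split does not. Under the stated assumptions, a reported p < .01 within one group therefore implies that every untied subject changed in the same direction.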

The threat-induced stress in Session 5 did
not affect performance in the control group
but improved detection probability in the
experimental group significantly, compared to
Session 4 and also to Session 3. The stress
session was characterized by an initial improvement
of performance. During the first
15 minutes all targets were detected, but after
that more and more misses were recorded.
Although this session represented a significant
improvement with regard to the third and
fourth sessions, the performance was still
significantly inferior to that of the control
group, who detected all targets.
Heart Rate


The average heart rates for the two groups 
are shown in Figure 3. The differences be- 
tween groups in the first four sessions were 
relatively small but significant. In the stress 
session the heart rate increased significantly 
in both groups. 


Discussion 


The detection performance deteriorated 

gradually during the 78-hour vigil but im- 
proved noticeably by the stress session. The 
improvement was rather temporary, and the 
number of detections again decreased toward 
the end of the session. Quick adaptation, 
however, is often found in laboratory stress 
situations (see for instance Bergström, 1970).
The performance changes in the experimental 
group thus conform with the first part of the 
proposed hypothesis, that is, that sleep loss 
and stress act antagonistically in a vigilance 
task.
A point to note here is that Session 4 took 
place at night, and hence the difference be- 
tween this session and the stress session may 
partially be due to the circadian rhythm of 
alertness. Unfortunately, it is not possible to 
separate the effects of sleep loss and circadian 
rhythm through the present experiment 
design. The conclusion that stress really im- 
proves performance is, however, also sup- 
ported by the significant difference between 
Session 3 and the stress session. The former 
took place at the same time of day as the 
latter. 

The second part of the hypothesis, namely,
that different detection rates are accompanied
by arousal changes, was supported first by the
fact that the experimental group had significantly
lower heart rates during sleep loss
and second by the elevated arousal during
stress.

The differences between groups during sleep
loss were, however, relatively small, and no
progressive lowering of arousal was found in
the experimental group through Sessions 1-4.
One explanation for this finding is that
the effects of sleep loss, contrary to those of
stress, are not in fact mediated by
arousal or, rather, the type of arousal
measured as heart rate. The original arousal
theory insists that electrocortical, autonomic, and
behavioral processes occur simultaneously as
a unit. There is evidence, however, that these
processes are different forms of arousal and
that one is not a valid index of the other
(Lacey, 1967).

Fig. 3. Average heart rate for a pretest (matching)
session and the five experimental sessions. (E-Group
= experimental group and C-Group = control
group.)

The results from the present experiment 
indicate clearly that the detection perfor- 
mance on a relatively simple watch-keeping 
task deteriorates rapidly when the subjects 
are sleep deprived. Moreover, the data sug- 
gest that only one night's sleep loss is suffi- 
cient to impair this vigilance task, whereas 
normally rested subjects easily maintain a 
100% detection rate. This implies that the 
performance of, for instance, night-working 
radar operators may degrade to an unaccept- 
able level and that special means must be 
used to keep them alerted. 

The stress conditions in the experiment 
brought about an improvement of the sleep- 
deprived subjects’ performance. One would 
expect, then, that the tired crew of a military 
radar station might raise its efficiency (at 
least temporarily) during an emergency, for 
example, during an enemy attack. 
EMPLOYEE REACTIONS TO A PAY INCENTIVE PLAN
The reaction of a work group to a pay incentive plan is studied. An analysis
of the employees' attitudes reveals that they trust management, understand
the plan, and see a close relationship between their pay and their performance.
Based upon expectancy theory, it is hypothesized that because these conditions
exist, the workers will respond directly to the economic payoff structure of the
plan. In order to test this hypothesis, a mathematical model is developed to
predict the productivity of the work group. The data show a high degree of
fit between the model's predictions and the actual productivity of the group.
The implications of this for future research and for the design of incentive
systems are discussed.


Research in the area of incentive systems
has shown that pay incentive systems often
do not lead to the productivity levels they are
designed to produce (Lawler, 1971). Group
norms, mistrust of management,
inappropriateness of economic incentives, lack of a
clear connection between pay and performance,
and incongruence with the wider
management system are a few of the reasons
why incentive systems fail (Lawler, 1971;
Roethlisberger & Dickson, 1939; Rothe,
1960; Whyte, 1955). As Whyte (1955) has
noted: “Financial incentives are both a
technical engineering and a human relations
problem [p. 261].” Combining the technical
and human aspects of a job situation in order
to design an effective incentive system is a
complex and little understood process.

Expectancy theory (Atkinson, 1964; Lawler,
1971; Vroom, 1964) provides a guide for
understanding the conditions under which an
economic incentive system might be successful.
The key elements of the expectancy
theory model of worker motivation are defined
as follows:
1. (E → P). The expectancy that successful
performance is possible if effort is
expended.

2. (P → O). The expectancy that successful
performance will lead to an outcome.

3. (V). The valence or attractiveness of an
outcome.


The model states that these terms combine 
as follows to determine motivation: (E → P)
× Σ[(P → O)(V)]. This model has a number
of implications for the conditions that are 
necessary if incentive systems are to motivate 
effective performance. 


1. The rewards offered by the system must 
be tied to a type of performance that 
employees believe they can influence and to a 
level of performance that employees believe 
they can achieve. Otherwise (E → P) will be
low and this will reduce motivation. 

2. The incentive system must provide a 
clear relationship between performance and 
rewards. If the rewards are not clearly related 
to performance, the employees will have a low 
motivation to perform.

3. The incentive system must be perceived
as relating more positive than negative out- 
comes to effective performance. The model 
emphasizes that individuals consider all the 
outcomes that are associated with effective 
performance. Sometimes high production is 
associated with negatively valued outcomes, 
such as rejection by co-workers, and as a 
result the positive value of the rewards may 
be countered by the negative value of the 
other outcomes associated with high production.
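
The combination rule stated above, (E → P) × Σ[(P → O)(V)], can be illustrated with a minimal sketch. The function name, the probability values, and the valence scale below are hypothetical and are not taken from the study; the sketch only shows how a negatively valued outcome (such as rejection by co-workers) offsets a positively valued one (such as pay).

```python
def motivational_force(e_to_p, outcomes):
    """Expectancy-model force: (E -> P) times the sum of (P -> O) * V.

    e_to_p   -- subjective probability that effort leads to performance
    outcomes -- list of (p_to_o, valence) pairs, one per outcome
    """
    return e_to_p * sum(p_to_o * valence for p_to_o, valence in outcomes)


# Hypothetical worker: fairly sure effort pays off (0.8); high pay is
# likely (0.9) and valued (+5), but co-worker rejection is possible
# (0.5) and disliked (-2).
force = motivational_force(0.8, [(0.9, 5.0), (0.5, -2.0)])
print(round(force, 2))
```

Setting the (E → P) term to zero drives the whole product to zero regardless of the rewards offered, which is the point of the first condition in the list above.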


Every pay incentive plan has an actual 
payoff structure that can be objectively 
stated, and the conditions listed above make 
it clear that the structure must have certain 
characteristics if it is to create perceptions 
that will motivate effective performance. Most 
importantly, there must be a strong connection
between pay and performance. Also, the
plan must tie significantly larger amounts of 
money to good performance than to poor 
performance. Finally, the plan must be based 
on measures and standards which are reason- 
able. 

It often is technologically difficult to create
plans that have the necessary characteristics. 
Sometimes performance is difficult to measure 
and to relate to pay. Often technology, tools, 
and/or the type of coordination used on a job 
restrict the employee’s freedom to control his 
performance level. 

The psychological problem in designing a 
financial involves creating 
he employee will 


such a 


comes he receives 


the costs to him of Setting these 


outcomes, 


In order for the employee to r 
plan, he 


© complex or is pre. 
le manner, it will not 
, the employees must 
will not change the 


if they produce 


communication m 
be present so that. when told oe 


will believe they can meet t 
Finally, the gener 
zation must be Such that 
against high production a 

One way of testing the theory that certain
psychological conditions must be present
before people will respond positively to a pay
incentive plan is to analyze an effective
incentive plan. If the theory presented here is
correct, it should have the characteristics
specified by the theory; on the other hand, 
an unsuccessful plan situation should not. The 
present study was designed to test for the 
presence/absence of these features in sit- 
uations where there are successful and unsuc- 
cessful incentive plans. In addition, it will try 
to determine if the employees actually do 
respond to the structure of an incentive 
system when the proper psychological condi- 
tions exist. This will be done by building a 
mathematical model of employee behavior 
which assumes that employees do respond to 
the structure of the incentive plan. The results 
generated by this model will be compared to 
the actual behavior of employees in a work 
group in which the conditions exist. 


METHOD 
Procedure 


The study began when the Harwood Company
(see Coch & French, 1948; Marrow, Bowers, &
Seashore, 1967, for descriptions of Harwood) asked
the researchers to study a shipping group in their
Marion plant which they felt had an effective
incentive plan. In order to study the plan, three types of
data were collected. First, the researchers gathered
productivity records for the 12 years prior to the
time of the study, and overtime, absenteeism,
turnover, and grievance data for the previous year.
Second, the researchers gathered information on the
nature of the incentive system. The information
describing the incentive system came from
interviews with the managers who ran the system and
from company records, which included a description
of the initiation of the system and a description of
the technical details of the plan.

Finally, attitudinal data were gathered from the
work group to examine how the employees
perceived their working situation and the incentive
plan. In order to measure attitudes, a questionnaire
and a semistructured interview were developed based
on preliminary interviews with four employees. Each
of the employees was interviewed, and each of them
filled out the questionnaire (one man chose not to
be interviewed, but he did fill out the questionnaire).
In order to develop comparisons that could be
used to evaluate the nature of the men's responses,
two other groups were studied. One group worked
in the same factory but did a different kind of work.
This group was given the questionnaire to determine
if the shipping group was representative of the work
force in the plant. A group in another plant doing
similar shipping work under a similar incentive plan
was also given the questionnaire, and one quarter of
the members of this group were interviewed. This
group was chosen because management did not feel
that their incentive plan was working; thus, it
provided an interesting comparison with the Marion
shipping group.

All the data for the study were collected during
regular working hours. The employees were told
that the study was being done by Yale University
and that their individual responses would be
confidential.

TABLE 1

DESCRIPTION OF SAMPLE GROUPS

Scale                               Reliabilityᵃ  Marion shipping   Other shipping    Other Marion
                                                  group (n = 10)    group (n = 16)    group (n = 11)

Protestant ethic (Blood, 1969)      .65           5.2               5.4               5.4
Nonprotestant ethic (Blood, 1969)   .6[?]         [?]               [?].5*            4.9
Internal-external (Gurin, Gurin,
  Lao, & Beattie, 1969)             [?]           3.5               2.9               2.6
Lower order needs                   .88           6.2               [?].4             6.4
Higher order needs (Hackman &
  Lawler, 1971)                     .68           [?]               5.3               5.1
Average age                                       39.0              33.0              46.0
Average education                                 some high school  some high school  some high school
Average years in company                          17.0              9.0               18.0
Average no. children                              1.9               1.8               1.7
Average years in position                         15.0              5.0               18.0

Note. The personality scales were adapted from the original sources for this research.
ᵃ All reliabilities were computed using the Spearman-Brown prophecy formula.
* These differ at the .05 level.

Subjects

Table 1 presents demographic and personality
data on the three work groups studied. As can be
seen, all three groups are composed of family men
who have high seniority. The major demographic
difference is between the two shipping departments.
The Marion group tends to be composed of older,
higher seniority workers.

Table 1 also presents scale scores on some
personality measures. These measures were only used
to determine if the groups differ from each other
and show only one significant difference between
the three groups.⁵

⁵ The other Marion group was included in order to
determine if the Marion shipping group was similar
to other groups in the plant. Since the two Marion
groups were very similar on both the demographic
and the personality measures, it was assumed that
the workers were part of the same populations. The
other Marion group will not be used in further
analyses.

Description of the Psychological Climate of
the Pay Systems

The psychological climate in which the incentive
systems operated was measured by six individual
questionnaire items and three scales.

The importance of pay to the subjects was
measured by a three-item scale (Spearman-Brown
reliability = .71).



How important to you is it to earn a good
income?

How important to you is it to get the best pay
you can?

How important to you is it to have a fair pay
plan?


The subjects' perception of their actual pay was
measured by a scale using the same three items with
the stem, “To what degree do you . . . ?” (Spearman-Brown
reliability = .89).

The value to the subject of his work group was
measured by a two-item scale (Spearman-Brown
reliability = .58).


We have a good group here.
Our group works well together.


The remaining items are presented in Table 2.
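
The Spearman-Brown reliabilities quoted for these short scales can be related to the average correlation between items. The sketch below assumes the standard Spearman-Brown prophecy formula was applied to the mean inter-item correlation; the source does not spell out the computation, and the function names are illustrative only.

```python
def spearman_brown(mean_item_r, n_items):
    """Reliability of an n-item scale from the mean inter-item correlation."""
    return n_items * mean_item_r / (1 + (n_items - 1) * mean_item_r)


def implied_item_correlation(reliability, n_items):
    """Invert the formula: mean inter-item correlation implied by a reliability."""
    return reliability / (n_items - (n_items - 1) * reliability)


# Two-item group-value scale with reported reliability .58
r_items = implied_item_correlation(0.58, 2)
print(round(r_items, 3))                      # mean correlation between the two items
print(round(spearman_brown(r_items, 2), 2))   # recovers the reported reliability
```

Under this assumption, the two group-value items correlate at roughly .41, which helps explain why that scale's reliability is the lowest of the three reported.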


Work 


The work done in both shipping departments is 
manual unskilled labor. The men take the products 
from the production floor and put them on shelves 
for storage. When an order comes in, they take the 
products off the shelves, put them into boxes, seal 
the boxes, and get them ready for shipping. At the 
end of the day, trucks come in and pick up the boxes.
Each of the men has a primary job such as putting
the products on the shelves or packing the boxes.
However, the men can do all of the different tasks;
if the work load is heavy in one area, the men shift
jobs.


Incentive System 


The Marion group's incentive system is a group
incentive. The hourly pay rate increases (above a
minimum) by 50% of a base rate each time the men
increase their production by 100% of a base. The
productivity base was historically determined and
had been in effect for 11 years at the time of the
study. Productivity is determined by using a simple
formula that had also been in effect since the
incentive was installed. The total amount that the group
produces in a week is primarily determined by orders
for goods, but the number of hours taken to ship
this total is dependent on the speed at which the
men work. Historically, management has not tried to
keep down overtime; the group has been left alone
in determining the number of hours it will work
(overtime is paid at time and a half based on the
pay rate which is used for the week).

TABLE 2

GROUP MEANS DESCRIBING WORK ATTITUDES

Scale or item                                Marion group   Other shipping    t
                                             (n = 10)       group (n = 16)

If I try hard, I can do my job well          6.0            5.6               .78
Importance of pay                            6.1            6.4               [?]
Actual pay                                   5.9            4.9               1.68*
Group value                                  6.0            4.6               4.48***
How much I earn depends on how hard
  I work                                     6.0            4.6               2.83**
I can easily figure what I should be
  paid at the end of the week                5.8            3.6               3.37**
No matter how much we produce the
  company will never change the pay plan     3.0            3.6               -.87
I've been treated fairly by this company     6.2            5.6               1.48
I work very hard during busy times           6.1            6.2               -.28

Note. The possible range on all these scales is from 1 to 7.
* p < .10.
** p < .01.
*** p < .001.

The incentive system in the other shipping group
was structurally similar to the system for the Marion
group. It was a group incentive which related the
workers' pay rate to the amount that they produced.
However, this plan was much more complex than the
Marion plan. It took into account differences
among the different products, and the
mechanics for calculating the actual rates of pay
were more complex. As in the Marion plan, the rates
were based on historical data and had not been
altered since they were set five years before the time
of the study.

RESULTS AND DISCUSSION
Attitude Measures


Table 2 presents mean job attitude scores
for the two shipping groups. The men in both
groups seem to believe that they can be
productive if they try. The speed at
which the work is done is directly determined
by how hard each man works. The Marion
group's mean response to “If I try hard, I can
do my job well” was 6.0 on a 7-point scale of
agreement. During the interviews, the men all
responded that the job was a relatively easy
one for unskilled labor. 

The incentive system does appear to be 
based on a reward that is valued by members 
of both of the work groups. Both the inter- 
views and the questionnaire indicated that 
pay was important to the men. 

In the Marion group, there do not appear 
to be any negative social outcomes tied to 
performing well. In the interviews they spoke 
favorably of their supervisor and the other 
group members (three members of the group 
compared it to a family), and they empha- 
sized that the group and the supervisor sup- 
ported higher productivity. The other ship- 
ping group differed. In the interviews it 
became clear that their supervisor was not 
well liked and that the men did not feel that 
they were a close group. The latter difference 
between the groups is reflected in the mean 
answers of the two groups on the group value 
scale (6.0 vs. 4.6, p < .01).

The Marion group appears to see a clear
connection between their effort and the
rewards they receive. The mean responses of 
6.0 to “how much I earn depends on how hard 
I work” and 5.8 to “I can easily figure out 
what I should be paid at the end of the week" 
indicate that this connection is quite clear. 
The other shipping group provides an
important contrast. Their plan was more 
complex, and the mean of this group was 
significantly lower on both items. In the inter- 
views, the men reported that they were dis- 
satisfied with the plan because they could not 
understand it and because they could not see 
the connection between rewards and effort. 

Neither of the shipping groups seemed to
fear that the company would actually change
the nature of the plan. They indicated slight
disagreement with the rather strong item
“no matter how much we produce, the company
will never change the pay plan” (3.0
and 3.6); but in the interviews all of the men
indicated that they did not expect the company
to change it. Furthermore, the men
seemed to trust the company to deal fairly
with them. These attitudes have a clear basis
in fact, since the incentive plan had not been
altered over a number of years.

The Marion work group has had a good
chance to learn how this plan operates. With
one exception, which will be discussed later,
they have been deciding how to behave under 
essentially the same conditions for a period of 
approximately 12 years. It is worth pointing 
out that under the plan the men felt that they 
were well paid and that they worked hard in 
return.

Based on these data, it seems clear that the 
Marion shipping group operated in a psycho- 
logical environment where the men should 
respond to the basic economic payoff struc- 
ture of their pay plan, while the other ship- 
ping department did not. The men in the 
Marion shipping department understood their 
plan, felt it offered positively valued rewards 
that they could achieve, trusted management, 
and did not see any negatively valued out- 
comes attached to productivity. 

The other shipping department met some 
of the conditions for a favorable psychological 
environment; however, in a few critical areas 
they did not. The major problem was that the 
men did not understand their pay system and 
the relationship between their own behavior 
and their rewards. It should be mentioned 
that this psychological problem was probably 
the result of the pay system's technical design. 
The pay plan was quite complex and difficult 
to understand. This complexity was increased 
since it was a group plan, which further added 
to the difficulty of seeing any relationship
between a worker's productivity and his
rewards.

If the propositions presented at the beginning
of this article are correct, the Marion
group should respond to the economic nature 
of their incentive plan, while the other ship- 
ping group should not. The prediction that 
the Marion shipping group did respond to the 
structure of the plan will be tested in the next 
section. As far as the other shipping group 
was concerned, all of the information the 
researchers were able to gather indicated that 
it was not succeeding. The workers did not
seem to respond to it, and the management 
in that plant was convinced it was a failure. 
Regrettably, the researchers were not able to 
collect the data to fully test this conclusion by 
building a decision model for this group in 
order to show that the workers did not 


respond to the economic incentives. 
Productivity Model

If the Marion shipping group is actually 
responding to the economic structure of the 
incentive system, its production should follow 
a rational decision model that would, up to 
certain limits, lead to their maximizing their 
pay. The limits are imposed by the costs 
associated with the high effort and the loss of 
leisure time which would be associated with 
high productivity. One way of determining 
if, in fact, the employees are behaving in a 
rational way is to see if their behavior can be 
approximated by a rational decision-making 
model. Such a model was developed and is 
described below. 

Based on the interview and questionnaire
data, it was decided to build a simple utility
model.* Three different characteristics were
included in the utility model: pay, effort, and
leisure. The men all indicated that they
considered pay to be the primary positive
outcome which they received from their job. One
of the primary costs was the extensive
overtime during some portions of the year (i.e.,
lost leisure time). Effort was included since
this is a “cost” to the men of working hard.
This factor came up a few times in the
interviews, although not as frequently as the other
two. For purposes of the model, effort was
assumed to be proportionate to productivity.
Using these three factors, the utility function
employed was:

$$U_i = \alpha f(AP_i) + \beta g(P_i) + \gamma h(ot_i/m_i), \qquad [1]$$

where $\alpha$, $\beta$, $\gamma$ are weights and are assumed
to be greater than zero.

$AP_i$ is the average pay per man for the
week.

$P_i$ is the average productivity per man
hour for the week.

$ot_i$ is the average number of overtime hours
for the group for the week.

$m_i$ is the average number of men working
for the week.

$ot_i/m_i$ is the average number of overtime
hours per man for the week.

$U_i$ is the average utility per man for the
week.


* The utility model is a simplified expectancy model
where (E → P) = 1 and (P → O) = 1 so that the
expectancy force is the sum of the valences of the
outcomes (i.e., their utilities).
There are a number of aspects of the
system which influence pay, but over which
the men have no control. These are the
number of dozens to be shipped during the week
($D_i$), the number of men doing the shipping
($m_i$), the number of hours spent doing
nonincentive (nonchargeable) related tasks
($nc_i$), and the number of chargeable, but
nonovertime hours to be worked ($t_i$). On the
other hand, the men do control how hard they
work ($P_i$) and thus, indirectly, the number
of overtime hours which they work ($ot_i$). The
reason they control overtime ($ot_i$) indirectly
is that orders have to go out. Thus the group
has to work long enough to ship all dozens
($D_i$) no matter how many there are. If they
work harder, this decreases the amount of
overtime necessary to do all of the shipping.

If we assume that the variables the men do
not control are given constants for the week,
we can develop the relationships between
average pay ($AP_i$), overtime per man
($ot_i/m_i$), and productivity ($P_i$). These
relationships are


AP, 


P; — 15 
- Ie $ Ew D S ELI 


Mi ; [2] 
Although it may not be immediately obvious,
this expression reflects the structure of the
incentive plan which has already been
described. The average rate of pay increases
by 50% above a base rate each time
productivity increases 100% above its base. An
individual's total pay is determined by taking
the number of nonovertime hours times the
pay rate for the week, plus 1.5 (time and a
half) times the number of overtime hours
times the base rate.


Using the stru 
as à basis, it is 
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à -D 
FREF (=) E] 
mP” 
where $f$, $g$, and $h$ are the utilities of pay,
effort, and lost leisure, respectively.⁵
This implies that the relationship between 
productivity and the other variables will be 


pue (PAIL My) — D 
2-a-f t 
2 af 


where $k$ and $c$ are parameters which can be
expressed in terms of $\alpha f$, $\gamma h$, and $\beta g$.


Fit of the Model 


The equations in the previous section 
described a model of how a rational decision- 
making group would optimize their utility 
based on certain parameters relating to pay, 
leisure, and effort. The problem is to deter- 
mine if this model is a valid predictor of the 
decisions made by the group. The validity of 
the model will be examined in two ways. First,
the predictions made by the model will be 
compared to the actual decisions which the 
group has made in the past. Second, the 
form of the model will be examined to see if it 
is reasonable.

In order to see how well the model fits the
actual decisions made by the group, the data
for each normal week over the year before
the study were used. This sample of weekly
data was split randomly into two halves. The
first half was used to fit the parameters $k$ and
$c$ to the data. In order to do this, $c$ was chosen
and then

$$D/[(nc - t/2) + cm]$$

was regressed on $P^2$ to obtain $k$. This
procedure was followed for a range of $c$'s, $0 > c > -300$.
This gave a result which had maximum
predictability (in correlational terms) for
$c = -100$, $k = -120$ (these are only approximate
answers since a trial-and-error, iterative
technique was used to obtain them). The
correlation coefficient between the dependent
variable ($P^2$) in the regression and the
expression above for the best fit values of $k$
and $c$ was .83 over the 22 weeks used in this
half of the sample. The parameter values
were then substituted in the equation and the
model was cross-validated on the other half
of the sample. The correlation between $P^2$ and
the predicted values was .88 over the remaining
21 weeks in the sample.

⁵ A complete description of the model and of the
assumptions which underlie it can be obtained from
the first author. The model basically assumes that
effort is proportional to productivity and that leisure
is negatively proportional to overtime. Also, the slopes
of the utility functions of pay, effort, and leisure were
assumed to be constant. Other slopes were tried, but
this assumption worked as well as any and appeared
to be the least complex.
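
The fitting procedure just described (choose c from a grid, regress to obtain k, keep the pair with the highest correlation, then cross-validate on the held-out half) can be sketched as follows. The sketch runs on synthetic, noise-free weeks, not the Harwood records. The regressor form D/[(nc - t/2) + c·m], the regression through the origin, and all names and value ranges are assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def regressor(week, c):
    """Assumed regressor: D / [(nc - t/2) + c*m]."""
    return week["D"] / ((week["nc"] - week["t"] / 2) + c * week["m"])


def fit(weeks, c_grid):
    """Grid-search c; for each c estimate k by least squares through the
    origin and keep the (c, k, r) triple with the highest correlation."""
    y = [w["P"] ** 2 for w in weeks]
    best = None
    for c in c_grid:
        x = [regressor(w, c) for w in weeks]
        k = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
        r = pearson([k * a for a in x], y)
        if best is None or r > best[2]:
            best = (c, k, r)
    return best


def cross_validate(weeks, c, k):
    """Correlation between predicted and actual P^2 on held-out weeks."""
    predicted = [k * regressor(w, c) for w in weeks]
    return pearson(predicted, [w["P"] ** 2 for w in weeks])


def make_weeks(n, seed, c=-100.0, k=-120.0):
    """Synthetic weeks generated from the assumed model (no noise)."""
    rng = random.Random(seed)
    weeks = []
    for _ in range(n):
        w = {"D": rng.uniform(3000, 7000), "nc": rng.uniform(20, 80),
             "t": rng.uniform(300, 450), "m": rng.randint(8, 12)}
        w["P"] = math.sqrt(k * regressor(w, c))
        weeks.append(w)
    return weeks


weeks = make_weeks(43, seed=1)
random.Random(2).shuffle(weeks)               # random split into halves
c, k, r = fit(weeks[:22], range(-300, 0, 10))
print(c, round(k, 1), round(cross_validate(weeks[22:], c, k), 3))
```

With real data the fitted and cross-validated correlations would fall below 1, as the reported values of .83 and .88 do; a cross-validated correlation close to the fitted one suggests the two parameters are not merely fitting noise in the first half.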

These results indicate the model has con- 
siderable empirical validity for the production 
data used. The model is able to explain an 
impressive amount of the variance using only 
two parameter values. Thus, there is justifi- 
cation for saying that a utility model 
describes the decision making of this group. 

The next test of the model is to see if it is
reasonable. There are two ways in which this
model can be unreasonable. First, the
structure of the model may not make sense.
Second, the parameter values which result
when the model is applied may be absurd.

Looking at Equation 3, it is not clear that the
structure of the model is sensible. It is hard
to believe that any employee would make a
decision that relates productivity to the
variables on which the decision is based in such
a complex manner. However, it is important
to remember that Equation 3 is a derivation
of the basic structure of the model and thus
may not clearly reflect it. Equation 2
provides a better basis for examining the
structure of the model. It describes the basic
outline of the ways in which the group's utility
will change as its productivity changes.

Equation 2 says several things. First, it
says that as the total amount to be produced
(D;) goes up, the optimum level of produc- 
tivity and the total utility will be higher. In 
other words, the group should produce more 
and become more satisfied (greater utility), 
the more there is to produce. This equation
also says that if nonchargeable hours are 
large in relation to total normal working 
hours, the men will benefit by working harder. 
This is reasonable since if they work less 
hard (i.e., a lower P), this will lower their
rate of pay not only for their productive 
hours but for all the nonchargeable hours as 
well. The extra overtime earned by working 
less hard may not compensate for the loss. 

What happens to the men's utility as they
increase their productivity given $D_i$, $nc_i$, $t_i$,
and $m_i$? As the men increase their
productivity, they lower their utility by lowering the
total amount they get paid (but at a decreas- 
ing rate as P gets larger). This happens be- 
cause, while they get paid at a higher rate as 
P gets larger, they lose overtime (at time and 
a half). The fact that it happens at a decreas- 
ing rate means that the smaller the amount of 
overtime, the less they lose by working faster. 

Also as P increases, utility decreases be- 
cause the men have to work harder. However, 
as P increases, the utility of the group is 
increased because it has more leisure (at an 
increasing rate of P). The fact that leisure 
increases utility more when P is small is rea- 
sonable because it says that the value of an 
hour of leisure is greater when the group is 
working a lot of overtime (i.e., has little
leisure). All of these implications of the model
are reasonable and they indicate that the
structure of the model is believable.

The actual parameter values of the model 
provide another indication of its validity. 
They imply that 


308 g 
"UR = — 100 
and 
Haf + 60v n 
—— =— 12 
"PT. 120. 
Thus,

$$\alpha f = -3.3\,\beta g$$

and

$$\alpha f = -0.21\,\gamma h$$

and

$$\gamma h = 15.7\,\beta g.$$


This implies that the weight of $\alpha f$ (i.e.,
pay) in the utility function is approximately
three times that of $\beta g$ (i.e., effort) and
approximately one fifth that of $\gamma h$ (i.e., overtime).
Similarly the weight of $\gamma h$ (overtime) is
approximately 16 times that of $\beta g$ (effort).
These results seem reasonable on two
grounds. First, during the interviews, effort
was mentioned less than pay and overtime
and probably does not have as great an
impact on the utility function as the other
two elements. Second, there is evidence that
overtime is more important than pay in
determining the optimum level of utility. Before
1967 the incentive system included a ceiling.
The group's rate of pay would only increase
until a certain level of productivity had been
achieved. Beyond this point increases in
productivity did not lead to increases in pay
(although they obviously did lead to decreases
in overtime). In the summer of 1967, the
ceiling was lifted. The effects of this on
productivity are shown in Figure 1. The effects on
productive hours are shown in Figure 2.
As can be seen, there was an increase in
productivity (p < .001). Overtime hours
data were not available for this year, but the
decrease in productive hours should come
primarily from overtime hours ($t$ is
approximately constant since there were no large
changes in the number of men during this
time). Before 1967, men could not get paid
for working harder beyond a certain point. As
a result, they chose to work less hard and get
more overtime (and thus more pay). However,
when they were given the chance to get
part, but not all, of this money by working
harder and earning a higher hourly rate, while
getting less overtime, they did. In effect, they
chose to have more leisure and less pay (it
should be noted however that they only chose
to do this when they got some pay for working
harder). This indicates that the greater weight
in the model on overtime as compared to
average pay does in fact have some basis in
the behavior of the men.
All of these data seem to indicate that the
model provides a fair description of the deci-
sion making of the shipping group. This group 
probably does choose its level of productivity 
based on an economic decision in response to 
their incentive plan. In this case, the incentive 
plan seems to have worked as it was intended, 
with good results for both the work group and 
the company.

This analysis shows that when the psycholog-
ical conditions are right, a work group will
respond to the payoff structure of a pay
incentive system. In this case, the technical 
design of the system motivated workers to 
work harder than they had before the system 
was introduced. The success of the plan 
appeared to be limited by the negative values 
of effort, but these could not be avoided. 

It is important to note that the men prob- 
ably do not actually make their decisions in 
the way indicated by the model. They 
undoubtedly do consider all of the variables, 
but they almost certainly do not have the 
abstract ability to combine these in the way 
the model does. It is more likely that because 
they frequently face similar configurations of 
t, nc, D, and m, they have learned (due to 
the clear and obvious feedback implied by this 
system) which productivity level gives the 
maximum payoff, and this is the level at which 
they produce when faced with the situation 
again. The process of finding the “best”
response may be random in nature; however,
because of the frequency of the decision, over
time the actual response approaches the
optimal one.
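
The trial-and-error account above can be illustrated with a toy sketch. It assumes a made-up payoff function and a made-up range of productivity levels; neither comes from the study's model, and the function name `learn_best_level` is hypothetical. The point is only that random sampling with clear feedback, repeated often enough, tends to settle on the payoff-maximizing level.

```python
import random


def payoff(level):
    # Hypothetical payoff curve with a single peak at level 7.
    return -(level - 7) ** 2


def learn_best_level(levels, payoff_fn, trials, seed=0):
    """Try random levels and remember the one with the best observed payoff."""
    rng = random.Random(seed)
    best_level = None
    best_payoff = float("-inf")
    for _ in range(trials):
        level = rng.choice(levels)      # random exploration
        observed = payoff_fn(level)     # clear, immediate feedback
        if observed > best_payoff:      # keep the best response seen so far
            best_level, best_payoff = level, observed
    return best_level


levels = list(range(11))
print(learn_best_level(levels, payoff, trials=500))
```

With few trials the remembered level may be suboptimal; with many repetitions of the same decision it almost surely coincides with the optimum, which is the informal argument made in the text.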


Overview of the Results 


Conclusions from this study about the 
nature of effective incentive systems should 
be treated with caution, since the data result 
from just one situation. Still, it is interesting 
that by considering just three factors (pay 
rate, effort costs, and leisure), productivity 
behavior could be predicted very well. This 
strongly suggests that under certain condi- 
tions workers respond to incentive systems in 
a simple, predictable way. Nowhere in the 
group was there evidence of the complex
gaming behavior that Whyte (1955) reports. 
This seems to be attributable to the general 
conditions that existed in the work group. Of 
course, the data do not conclusively prove
that these conditions were the reason that the 


ever, the failure of the j 
the shipping 
not met does 


appeared to be operating there. In other situations,


and the results of 
ed. 


a positive psy-
chological climate and the lack of negative
outcomes which could have been tied to pro-
ductivity were not the result of chance. In
both plants, the psychological conditions were 
the result of a great deal of work and effort 
on the part of the management and a number 
of social scientists (Marrow, Bowers, and 
Seashore, 1967). Their success with the 
Marion group indicates that a combination of 
social and technical engineering can be effec- 
tive and can have very favorable results for 
both the employees and the company. Their 
failure in the other shipping department indi- 
cates that joining social and technical engi- 
neering can be a difficult task and is likely to 
remain so until more knowledge has been 
gained about the way in which the two 
processes can be effectively combined.

This study examined the moderating effects of organizational independence. Survey data were collected from 211 quasi-professional employees in one firm and from 111 hourly employees in another. Many findings were contrary to the hypothesized relationships. Specifically, job scope and leader hierarchical influence were more positively related to satisfaction for employees who perceived organizational independence than for employees with no such perceptions. Also, leader consideration and leader technical competence were more positively related to performance for employees perceiving organizational independence.


House, Filley, and Gujarati (1971) re- 
ported that the relationships between several 
dimensions of a superior’s leadership behavior 
and his subordinates’ satisfaction were not sig- 
nificantly moderated by the degree of “hier- 
archical influence” imputed to the superior by 
his subordinates. In view of Wager’s (1965) 
finding that the greater the subordinate 
independence, the less the moderating effect 
of influence, the authors have suggested that 
subordinate independence may moderate the 
effects of other managerial and leadership 
practices. Subordinates who see themselves 
as being independent of the company, as hav- 
ing high labor mobility, and as not being 
tied to any particular organization for their 
work satisfaction, are said to have “orga- 
nizational independence,” regardless of the 
objective state of the labor market.

This orientation of organizational independence has been shown to be a major attitudinal characteristic differentiating organizational “cosmopolitans” from “locals” (Gouldner, 1957, 1958; Merton, 1957). Cosmopolitans are organizational members who have a stronger commitment to their discipline and a weaker loyalty to their organizations than locals (Blau & Scott, 1962; Etzioni, 1964; Gouldner, 1959; Kerr, 1972; Marcson, 1960; Whyte, 1970). Although important differences exist between organizational independence and the more complex construct of cosmopolitanism, we would expect the attitudes and behavior of “organizational independents” and “organizational dependents” (subsequently referred to as “independents” and “dependents”) to be consistent with previously reported research on cosmopolitans and locals.

1 Requests for reprints should be sent to Steven Kerr, College of Administrative Science, Ohio State University, 1773 South College Road, Columbus, Ohio 43210.

Since independents are subjectively less 
dependent on their employing organization for 
their need satisfactions, it was assumed that 
they would be motivated more by their own 
internalized standards and would be satisfied 
by achieving them, whereas dependents would 
be motivated and satisfied more by organiza- 
tional stimuli and rewards, such as the 
behavior of superiors and the reward and 
authority structure of the organization. 
Furthermore, since independents are likely to 
be more highly educated, it was reasoned that 
they would expect and respond positively to 
practices that provided them with more job 
autonomy and task variety. 

Specifically, it was predicted that for inde- 
pendents there would be greater positive cor- 
relations between the following sets of predic- 
tor and criterion variables than for depen- 
dents: 

Predictor variables
Span of control of immediate superiors
Subordinates’ psychological participation
Job scope


Criterion variables
Subordinate satisfaction with the organization
Subordinate satisfaction with role expectations
Subordinate performance


It was also hypothesized that for independents
there would be smaller correlations between 
the following predictors and criteria than for 
dependents: 


Predictor variables 
Leader consideration 
Leader initiating structure 
Leader technical competence 
Adherence to chain of command 
Leader hierarchical influence 


Criterion variables 
Subordinate satisfaction with the organization 
Subordinate satisfaction with role expectations 
Subordinate performance 


METHOD 


The present research consisted of tests of the hypotheses with one sample and of an attempt to replicate the findings from that sample in a different population.

For the first sample, data were collected from the salaried full-time personnel at the New York, North Carolina, and major New Jersey locations of a medium-sized chemical and plastics manufacturing firm. The finance division did not agree to participate, and its 60 employees were therefore excluded from the sample. Of the 240 employees who were asked to complete questionnaires, 211 returned completed forms, an 89% response rate. The 29 individuals who did not respond included several who only completed a small portion of the questionnaire. Administration was in groups of from 5 to 8 persons, and anonymity was assured.

To test the generality of the findings, a second study was conducted in another company based on a population possessing several characteristics different ... manufacturing employees of a ... plant located in upstate ... asked to participate. The 111 questionnaires that were completed represented a 92% response rate of those in attendance on the day the questionnaires were administered. As before, administration ... small groups, employees were ... that the study was for research purposes, and anonymity was guaranteed.

At ... questionnaires were administered, ... were asked to indicate the names of their immediate superior and of two peers who knew their performance best ... each employee named in response ... visited at his work site ...


Description of the Measures


Leader consideration. Consideration scores were 
obtained from 15 questions in the Leader Behavior 
Description Questionnaire (Stogdill & Coons, 1957; 
Stogdill & Shartle, 1955). Items referred to whether 
the leader was perceived as being considerate of 
subordinate needs, as willing to explain his actions, 
and as being warm, supportive, and friendly. 

Leader initiating structure. The initiating structure 
scale consists of 15 questions contained in the Leader 
Behavior Description Questionnaire and measures to 
what extent subordinates perceive the leader as 
structuring the work environment by assigning par- 
ticular tasks, specifying procedures to be followed, 
and clarifying expectations. 

Leader hierarchical influence. This scale consists of
7 questions, most of which were developed by 
Comrey, Pfiffner, and High (1954). It measures up- 
ward influence of the leader with respect to personnel 
management of his subordinates and to participation 
in policy decisions. 

Leader technical competence. This 6-item scale was 
also obtained primarily from questions developed by 
Comrey et al. (1954) and measures the degree to 
which the superior is perceived as capable of pro- 
viding advice on technical or specialized problems. 

Respondent's psychological participation. This
scale consists of 6 items, most of which were
developed by Vroom (1960). It measures the extent 
to which the individual feels that he influences joint 
decisions made with his superior. 

Organizational independence. The organizational 
independence scale reflects the applicability of the 
worker’s knowledge and experience gained on his 
present job to other firms and his perceived ability 
to leave his present job and obtain an equivalent one 
elsewhere. The items comprising this scale and those 
which make up the job scope and adherence to chain 
of command scales have been published elsewhere 
by Kerr, House, and Wigdor (1971). Data concerning
the validation of the organizational independence,
job scope, adherence to chain of command, and span
of control measures have been presented by Wigdor
(1969).

Job scope. This 5-item scale measures the extent to which the subject performs a variety of tasks, ... projects through to completion, and determines ... objectives and methods.

Span of control. A single question was asked to determine the number of individuals reporting to the subject's superior: "How many people report directly to your immediate superior?"

Adherence to the chain of command. This 4-item scale measures the extent to which communications and task assignments follow the chain of command, and the number of superiors to whom the subject is accountable.

Satisfaction of employee role expectations. The Job Description and Job Expectation questionnaires developed by the Personnel Research Board of the Ohio State University (Stogdill, 1960) were utilized. The Job Description Questionnaire essentially measures
employee attitudes toward the company and its
management. The Job Expectation Questionnaire 
measures employee satisfaction with role expectations 
regarding (a) intrinsic elements of the work itself, 
(b) advancement, (c) the prestige of the respon- 
dent’s job as compared with jobs of others, (d) 
pay, (e) freedom, (f) family attitudes toward the 
respondent's job, and (g) job security. In the present 
study, advancement, pay, and prestige expectations 
items were combined to form a single measure— 
extrinsic job satisfaction. The 3 satisfaction measures 
employed in this study therefore consisted of intrinsic 
and extrinsic satisfaction as well as satisfaction with 
freedom, family attitudes, and security. 

Performance ratings. The job performance scale 
used was a multitrait-multirater scale. Lawler (1967) 
has indicated that “with this approach it is possible 
to assess the criterion by determining its convergent 
and discriminant validity, and it is not necessary to
depend on an objective indicator such as profits or 
sales that may miss the essence of the job [p. 370].” 

Obviously, raters must be familiar with the aspects 
of the individual’s performance to be rated. Other- 
wise, the ratings are more likely to be affected by 
the halo tendency (Bescoe & Lawshe, 1959). Subjects 
were, therefore, asked to rate themselves (in the first 
sample) and to indicate, prior to receiving the per-
formance questionnaire, the name of their superior(s)
and the names of two peers who knew their perfor-
mance best. The choice of which superior (if more
than one existed) and of which peer rated the sub- 
ject was done randomly. 

The traits rated for each subject were quality, 
quantity, ability, ability without guidance, initiative, 
and effort. A 7-point scale—excellent, very good, 
good, average, fair, poor, or inadequate—was used 
for each rating.

Scale reliabilities and validities. Scale reliability 
data for these samples have been published else- 
where by Kerr et al. (1971). Overall, the only low 
reliabilities, computed by a Spearman-Brown proph- 
ecy formula correction for number of items of the 
Kuder-Richardson Formula 20, were for adherence 
to chain of command (.50 in Company 1 and .51 in 
Company 2) and for satisfaction with respect to 
security (.59 in Company 1 and .61 in Company 2). 
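
The two reliability formulas named above can be sketched as follows. This is an illustration under stated assumptions, not the authors' computation: the item data are invented, items are treated as dichotomous (0/1) as Kuder-Richardson Formula 20 requires, and the function names `kr20` and `spearman_brown` are hypothetical.

```python
from statistics import pvariance


def kr20(item_scores):
    """Kuder-Richardson Formula 20 for dichotomous (0/1) items.

    item_scores: one list of 0/1 item scores per respondent.
    """
    n = len(item_scores)
    k = len(item_scores[0])
    totals = [sum(row) for row in item_scores]
    total_variance = pvariance(totals)
    # Sum of p*q over items, where p is the proportion scoring 1 on the item.
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_scores) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / total_variance)


def spearman_brown(reliability, length_factor):
    """Predicted reliability when test length is multiplied by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Invented data: 4 respondents, 3 items.
data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
r = kr20(data)
print(r, spearman_brown(r, 2))
```

The Spearman-Brown step shows why short scales, such as the 4-item adherence to chain of command scale, tend to have lower reliabilities: the same item quality yields a higher coefficient as the number of items grows.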

In general, performance validities for the first com- 
pany are similar to those of Lawler (1967) in that 
the self-ratings have low correlations with ratings 
by superiors or peers. Because of the stronger validity 
of the superior-peer ratings, averages of these ratings 
were used to test the hypotheses, and the self-ratings 
were dropped from the analysis. 

The performance measures were tested for convergent and discriminant validity according to the criteria suggested by Campbell and Fiske (1959). All six of the peer-superior coefficients met the test of convergent validity at the .01 level of significance in each of the companies studied.

One of the tests of discriminant validity requires that the correlations between two ratings of a single trait should be greater than the correlation between any one rating of that trait and any other trait rated by the same rater. All ratings in both studies failed this test. This is a very stringent test that, as Gunderson and Nelson (1966) and Lawler (1967) indicate, is seldom met by behavioral trait data. More recently, Jackson (1969) argued that this test is unrealistic in that it ignores the strong tendency of monomethod ratings to be biased by method variance when the question of validity is primarily one of correlation of trait variance.

The second test of discriminant validity requires 
that the validity coefficients be greater than the 
corresponding coefficients in the heterotrait-hetero- 
rater triangles. All peer-superior ratings in Company 
1 met this test at the .055 level of significance or
better. In Company 2, all peer-superior ratings 
except ability and quantity were significant at the 
.055 level or better. 

The third test of discriminant validity, computed 
by ordering the coefficients in the heterorater and 
monorater triangles and computing the W coefficient 
of concordance, was met at the .001 significance level 
in Company 1 and at the .05 level in Company 2. 
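
The W coefficient of concordance mentioned above can be sketched as below. This is a minimal illustration assuming complete rankings without ties; the rankings are invented and the function name `kendalls_w` is hypothetical. It does not reproduce the authors' ordering of coefficients within the heterorater and monorater triangles.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance W for untied rankings.

    rankings: list of m rankings, each assigning ranks 1..n to the same n objects.
    """
    m = len(rankings)
    n = len(rankings[0])
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_rank_sum = m * (n + 1) / 2
    # S: sum of squared deviations of the rank sums from their mean.
    s = sum((total - mean_rank_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


print(kendalls_w([[1, 2, 3], [1, 2, 3]]))  # perfect agreement
print(kendalls_w([[1, 2, 3], [3, 2, 1]]))  # complete disagreement
```

W ranges from 0 (no agreement among the rankings) to 1 (identical rankings), so a significant W indicates that the pattern of coefficients is ordered similarly across the triangles being compared.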

Method of analysis. Spearman rank correlations
were computed to determine the degree of relation- 
ships between predictor and criterion variables. 

To test the moderating effects of organizational 
independence, the mean scale score was computed, 
and all subjects in the one third of the sample 
scoring closest to the mean were assigned to the 
medium independence group. All subjects scoring 
higher than the range of scores comprising the middle 
group were assigned to the high-independence group,
while all subjects scoring lower were assigned to the 
low-independence group. Separate correlations were 
then computed for relationships within each group. 

Since in this study the predictor variables were not independent of each other, the effects of the moderator were analyzed in terms of monotonically increasing correlations with systematic changes of the moderating variables rather than by multiple significance of difference tests.2
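
The subgrouping procedure described above can be sketched as follows. It is a rough illustration under assumptions: the data are invented, ties are handled by average ranks, scores that tie with the boundary of the middle range are placed in the medium group, and all function names are hypothetical.

```python
def ranks(values):
    """Average ranks (1-based), with ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def split_by_moderator(scores):
    """Label each subject low, medium, or high on the moderator.

    The third of the sample closest to the mean forms the medium group;
    scores above that range are high, scores below it are low.
    """
    n = len(scores)
    mean = sum(scores) / n
    nearest = sorted(range(n), key=lambda i: abs(scores[i] - mean))[: n // 3]
    low_bound = min(scores[i] for i in nearest)
    high_bound = max(scores[i] for i in nearest)
    labels = []
    for s in scores:
        if s > high_bound:
            labels.append("high")
        elif s < low_bound:
            labels.append("low")
        else:
            labels.append("medium")
    return labels


def changes_monotonically(low_r, medium_r, high_r):
    """True if the correlation strictly rises or strictly falls across groups."""
    return low_r < medium_r < high_r or low_r > medium_r > high_r


print(split_by_moderator([1, 2, 3, 4, 5, 6, 7, 8, 9]))
print(spearman([1, 2, 3], [10, 20, 30]))
print(changes_monotonically(0.10, 0.25, 0.40))
```

In use, `spearman` would be computed separately within the low, medium, and high groups for each predictor-criterion pair, and `changes_monotonically` applied to the three resulting coefficients.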


RESULTS 


Company 1. All correlations that change 
monotonically with increases or decreases in 
the moderator are summarized in Table 1. 
Findings from the first firm not only failed to 
confirm the hypotheses, but suggested 
opposite relationships of many of the vari-
ables.

2 An alternate method of analysis would have been
to derive independent predictors on the basis of a 
factor analysis of the predictor scale. This alternative 
was not chosen because to have done so would have 
resulted in the loss of specific variance in each theo- 
retical predictor and thus would have prevented 
interpretation of the results in light of the general 
proposition from which the hypotheses were derived. 
This alternative would also have prevented com- 
parison of the present findings with previous studies 
using the same scales.

Table 1 (Continued)

[Table body not recoverable from the source. Column headings: Predictor and criterion variable; Range of rs which increase (decrease) monotonically with increases in independence, for Firm 1 and Firm 2. Criterion rows listed: extrinsic satisfaction, family attention, security, ability, initiative, quantity, quality, effort, ability without guidance; then, under adherence to chain of command: company satisfaction, intrinsic satisfaction, extrinsic satisfaction, security, freedom, quality. Table notes gave significance at the .10, .05, and .01 levels.]


From Table 1 it can be seen that many 
correlations between predictor and satisfac- 
tion variables increased monotonically with 
increases in organizational independence. Cor- 
relations between each predictor and at least 
1 measure of satisfaction and correlations
between company satisfaction and 6 of the 8 
predictors increased monotonically as organi- 
zational independence increased. This pattern 
was consistent and pronounced. 

The performance variables also revealed 
different patterns for dependents and indepen- 
dents, although differences in correlates of 
performance were less pervasive than those of 
satisfaction. For example, correlations be-
tween effort and leader consideration, initiat-
ing structure, technical competence, and job 
scope increased monotonically as the samples 
scored higher on the independence scale. 
Correlations between ability without guidance 
and leader hierarchical influence and technical 
competence also increased monotonically with 
increases in independence, as did correlations 
between initiative and leader consideration 
and initiating structure. None of the other 
performance variables changed monotonically 
with independence for more than 1 predictor. 


Note (Table 1): ... to high. Only low and high coefficients are reported here.


The moderator had a positive effect on at 
least 1 performance variable of leader tech- 
nical competence, job scope, and the leader 
behavior variables consideration, initiating 
structure, and hierarchical influence. The 
moderator had no effect on the performance 
variables of adherence to chain of command 
and had a negative effect on correlations be- 
tween leader span of control and 3 perfor- 
mance measures and between respondent 
psychological participation and 3 performance 
measures. However, only for the relationship 
between psychological participation and abil- 
ity without guidance was the correlation even 
moderately high for dependents (r = .38)
and not for independents.
Company 2. The above findings were 
contrary to expectations, and suggested that 
the satisfaction and performance of indepen- 
dents is more closely associated with leader 
behavior than is the satisfaction and perfor- 
mance of dependents. If this finding has 
generality across different kinds of popula- 
tions, it has significant implications for orga- 
nization theory and management practice. 
The second study employed the same mea- 
sures as the first, except that the initiating 


TABLE 2
SIGNIFICANT REPLICATED CORRELATIONS BETWEEN PREDICTORS AND CRITERIA FOR ORGANIZATIONAL DEPENDENTS AND INDEPENDENTS

Predictor and criterion variable | Dependent: Study 1 | Dependent: Study 2 | Independent: Study 1 | Independent: Study 2
36 | .so | 
26 | .38 
| 
atisfaction with freedom 
ffort | | 
T competence | 
Company satisfaction | | A3 | SI 
Quality Al} 38 
Effort | | 3 | M 
Initiative | 30, 30 
Ability without guidance | 38 En 
Quantity | 35 | [a0 
Psychological participa | 
Company sati | | .26 | .31 
Intrinsic sa 40 5] | 
faction freedom 52 39 45 | 29 
Xtrinsie satisfaction 28 | .37 
Job scope | i 
Effort | | | 


structure scale was omitted due to a collating error. Table 1 shows that 9 relationships between predictors and criteria changed monotonically in the same direction as in the first firm, while 4 relationships changed monotonically in the opposite direction. It is also apparent that the relationships between predictor and satisfaction variables were not moderated as strongly by organizational independence as in Company 1. Thus, so far as satisfaction is concerned, the moderator effect seemed to be restricted to the first population studied and could not be interpreted as suggestive of a general phenomenon.

Several of th 
performance we 


= ] Increased in organi- 
n general, 


Significant correlations. In addition to determining whether the moderator effect was replicated, we also sought to uncover correlations that were of similar magnitude in both studies. To do this it was necessary to choose some criterion for accepting a correlation as approximately equal in both firms. Although use of the .05 significance level as the criterion violated the assumption of independence of the predictor variables, such a violation did not seem crucial, since we were interested in determining whether a common profile existed and not whether any single correlation could be reliably asserted as significant.

Table 2 presents listings of those correla- 
tions significant for dependents and indepen- 
dents in both firms. It can be seen that for 
dependents, only two predictors had any 
common associations across both companies. 
These were consideration and psychological 
participation. Interestingly, both of these are 
characteristic of Theory Y (supportive) 
leadership orientations, and both were related 
to satisfaction but not to performance.

For the independents, 5 predictors repli- 
cated—psychological participation, consider- 
ation, hierarchical influence, technical com- 
petence, and job scope. Viewed collectively, 
the leadership characteristics are those of 
“great man” leadership practices (Bales, 
Borgatta, & Couch, 1954); that is, a leader 
simultaneously high on consideration, hier- 
archical influence, and technical competence 
would resemble one described by the great 
man theory of leadership. All of these predic- 
tors except psychological participation cor- 
related with subordinate effort, and all except 
job scope correlated with company satisfac- 
tion. 


Discussion 


These findings indicate that the organizational independence scale has a positive moderating effect on the relationships between leader behavior, job scope, and subordinate satisfaction and performance in both companies. The correlations generally increased systematically with increases in organizational independence. Only one relationship was found to decrease in both samples as organizational independence increased, and that relationship was so low as to be negligible.

Several possible explanations for these find-
ings might be considered. First, the moderator
effects could have been due to differences in 
variance in the subsamples (the moderated 
groups). However, analysis of means and 
standard deviations for all subsample vari- 
ables revealed no systematic differences in 
variance across groups that could account 
for the monotonic increases in correlations 
with increased organizational independence. 

Second, one of the predictors might have 
covaried with organizational independence, in 
which case job characteristics would account 
for the moderating effect of organizational 
independence. To test this possibility, organi- 
zational independence was correlated with all 
predictor and criteria variables and found to 
correlate significantly with 14 of them. Of 
these, only 3 were significant in both firms, 
and none was sufficiently high in both com- 
panies to support this possibility. 

A third concern was that respondent scores 
on the organizational independence scale were 
a reflection of psychological needs or task 
characteristics. It was possible to examine 
this possibility because two other scales were 
administered to test hypotheses not reported 
in this paper. The scales, respondent need for 
independence (Vroom, 1960) and respondent 
task independence (independence from others 
in performing his task) were insignificantly 
correlated with the organizational indepen- 
dence scale across both firms. The possible 
explanation that respondents’ high degree of 
organizational independence was due to 
psychological needs for independence or task 
characteristics was therefore not supported. 

Finally, it is possible that respondents high 
in organizational independence had higher 
expectations for competent, employee-oriented 
management practices, while those scoring 
low on the scale were more content with what- 
ever competence level or style of management 
practices they received. This possibility can- 
not be adequately tested with our data. 
However, such an interpretation is consistent 
with theories of attitude and attitude change 
which argue that man has a need for 
cognitive, emotional, and behavioral balance 
or consonance. Cognitive dissonance theory
suggests that, to the extent that people per-
ceive no real opportunity to change an un-
pleasant situation, they will tend to alter their 
felt displeasure toward that situation. In some 
cases they may go so far as to convince 
themselves that they are really happy with 
the status quo. The possibility therefore exists 
that the independents in these samples are 
more keenly aware of any shortcomings in 
leader behavior or organizational practices 
that may exist. Such people do not find it 
necessary to reduce dissonance by convincing 
themselves that they like the situation, 
because they perceive themselves able to 
change (i.e., leave) the situation. 

Those respondents who scored low in orga- 
nizational independence would be expected to 
act differently. Instead of allowing such 
independent variables as consideration and 
technical competence of superior to directly 
affect satisfaction and performance, such 
people would be predicted by dissonance 
theory to insensitize themselves to such mat- 
ters, reducing dissonance by responding (and 
believing) that they in fact are satisfied by 
inconsiderate and technically incompetent 
leaders. 


CONCLUSIONS 


The major conclusion of this research is 
that, contrary to expectations, organizational 
members perceiving themselves as possessing 
independence from the organization were 
highly sensitive to managerial practices within 
their organizations. Although correlational 
data are insufficient to draw causal inferences, 
this finding appears to warrant either longi-
tudinal or experimental tests.

Although findings from two samples are 
hardly sufficient to claim generality, the dis-
similar geographic, demographic, and task
characteristics of the samples suggest that
those findings that were replicated might well 
represent a general phenomenon. Specifically, 
these findings were that job scope and hier- 
archical influence were more positively corre- 
lated with satisfaction, and leader consider- 
ation and technical competence were more 
positively related to ratings of performance, 
for independents than for dependents. 

The major theoretical significance of these 
findings concerns the usefulness of the organi-
zational dependence-independence construct
for classifying organizational members. If 
these findings are supported by future 
research, this classification should improve 
the prediction and explanation of behavior in 
complex organizations. 

The major practical significance of the 
findings concerns the usefulness of the depen- 
dence-independence construct for selecting 
leader styles and organizational practices 
appropriate for employees with different 
perceptions. In a society that is becoming 
increasingly mobile, it is probable that a 
greater proportion of tomorrow’s employees 
will perceive themselves as organizationally 
independent and will therefore have to be 
managed differently from dependents. The 
findings reported here suggest some appro- 
priate strategies for satisfying and stimulating 
higher performance from both dependents 
and independents.

AN EMPIRICAL TEST OF FACTOR INVARIANCE 1

This study compares the results of independent hierarchical factor analyses of two administrations of the same employee attitude questionnaire in the same company to similar employee populations that were 10 years apart in time. The factors were compared using Tucker’s method of factor comparison. A general factor, 2 third-order, 4 second-order, and 17 first-order factors emerged in each analysis. The factor structure showed a remarkable stability with only 3 of the Tucker coefficients below .80 for the first-order factors. The implications for employee attitude measurement are discussed.


One of the tasks frequently encountered by 
the personnel psychologist is the assessment 
of employee attitudes. While there are a va-
riety of approaches to assessing employee
attitudes, the questionnaire or survey ap-
proach is the one most frequently employed.
The construction of such questionnaires often
utilizes the factor-analytic approach to de-
velop scales to measure the relevant aspects
of the job environment.

While few studies are reported concerning 
the stability, over time, of these empirically 
derived structures of employee attitudes, 
many surveys are repeated to determine 
change in employee attitudes. In such sit- 
uations, the investigator assumes either im- 
plicitly or explicitly the concept of factor in- 
variance (Thurstone, 1947), that is to say, 
he assumes that changes observed are in the 
employees’ attitudes and not in the dimen- 
sions measured by different items in the ques- 
tionnaires. The present study describes an 
empirical test of the factor invariance of 
such employee attitude measures by compar- 
ing the results of independent factor analyses 
of the same questionnaire administered to 
similar employee populations in the same 
company 10 years apart in time. 


1 This article is based, in part, on the
author's master's of business administration thesis
at the Ohio State University.
2 Requests for reprints should be sent to Darrell E.
Roach, Nationwide Insurance, 246 North High Street,
Columbus, Ohio 43216.


METHOD 
Procedure 


The procedure for administering the questionnaires 
was identical in both surveys. Questionnaires with 
return envelopes addressed to the vice-president of 
personnel were distributed to all full-time employees. 
The employee had the option of completing the ques- 
tionnaire on his own or on company time and mailing 
it via interoffice mail or outside the office. All ques- 
tionnaires were anonymous. The response rate was 
82% in 1956 and 70% in 1966, yielding sample sizes 
of 4,052 for the 1956 and 4,882 for the 1966 survey. 

The samples in both years were heterogeneous, 
including clerical employees, supervisors, and tech- 
nical and management personnel who had vary- 
ing lengths of service and who worked in 14 different 
locations scattered throughout the eastern part of 
the United States. It is estimated that approximately 
one third of the 1966 sample also participated in 
the 1956 survey. 

The questionnaire was developed by the company’s 
research staff, and items were selected on the basis 
of a factor analysis of an earlier questionnaire 
(Roach, 1958). The questionnaire consisted of 98 
items that were answered by checking 1 of 5 response
categories. The extreme and middle categories were
labeled as very satisfied, fairly satisfied, and very 
dissatisfied. 

There were also 5 classification items dealing with 
length of service, salary grade, type of position, 
office or field location, and career intentions. 

The intercorrelations of the 98 attitudinal items 
and the 5 classification items were calculated for 
each of the surveys. The resultant correlation ma-
trices were factor analyzed.

The factor-analytic procedure utilized two com-
puter programs. The first program extracted the
factors using the centroid method. The second pro-
gram was designed to extract higher order factors.
This latter program ranks the intercorrelations be-
tween the variables by column, then calculates rank-
order correlations. These correlations are then ranked,
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TABLE 1 


FIRST-ORDER FACTORS WITH SELECTED ITEM LOADINGS: ORIGI

Factor

1. Physical working conditions
    Maintenance of office and equipment
    Physical factors of job
    Equipment
    Lunch facilities
2. Justice and interest of management
    Management's fairness in treatment of employees
    Cooperative philosophy
    Management's interest in welfare of employees
    Management's awareness of employee's problems
3. Pride in company
    Pride in product
    Feeling about company in general
    Company reputation in the community
4. Intrinsic job satisfaction
    Kind of work done
    Interest in the work
    Feeling of doing something worthwhile
5. Co-workers
    Friendliness of co-workers
    How well co-workers get along
    Co-workers as friends
6. Immediate supervision
    Supervisor's consideration for people in section
    Supervisor's understanding and sympathy for employee problems
    Supervisor's abilities and knowledge of jobs supervised
    Supervisor's knowledge of what is going on
7. Job security
    Discharges without cause
    Security in job
    Administration of absence policy
8. Freedom from work rules
    Lunch facilities
    Rest periods
    Absence policy
9. Setting up and enforcing job standards
    Knowing what is expected
    Training given for present job
    Checking on work
10. Confidence in ability of management
    Official's ability to build good organization
    Official's ability to solve company problems
    Official's ability to plan for future expansion
    Leadership provided by company officials
11. Downward communications
    Reasons for changes in policy or operations
    Information about company finances and progress
    Availability of information
12. Employee benefits
    Company benefits program
    Health insurance program
    Paid vacation program
13. Infrequent benefits
    Annual picnic
    Christmas gift
    Medical services
    Employee publications
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‘Table 1—(Continued) 


Factor                                             Original study   Later study

14. Opportunities for relief from boredom (.54)
    Hours of work                                  [illegible]      .56
    Rest periods                                   .32              .50
    Activities program                             [illegible]      [illegible]
    Paid vacation program                          .08              .22

15. Job demands (.92)
    Pressure on the job                            .64              .61
    Time allowed for job                           .60              .55
    Amount of work expected                        [illegible]      [illegible]
    Frequency of rush jobs                         [illegible]      .52

16. Development and advancement (.89)
    Opportunities to learn                         [illegible]      [illegible]
    Training given for better jobs                 .46              [illegible]
    Opportunity compared to other companies        [illegible]      .80
    Availability of good jobs                      .42              .56

17. Salary (.91)
    Salary compared to others in division          .32              [illegible]
    Division salaries compared to other divisions  .52              .54
    How well paid and rewarded for work            [illegible]      .58
    Classification into job grades                 [illegible]      .53


Note. A complete listing of the items and factor loadings may be obtained by writing to the first author.
a The coefficients in parentheses are Tucker coefficients of agreement between the two analyses.
b This factor did not emerge in the later study.


and the procedure is iterated until the correlations
approach ±1.00. The groups of positive correlations
serve to identify variable clusters. The intercorrela-
tions and factoring of these clusters were performed
by Thurstone's multigroup method, and the final
hierarchical solution was obtained using the method
described by Wherry (1959). The factors obtained
in the 1956 and 1966 surveys were compared using
Tucker's method of factor comparison (Tucker,
1951).

RESULTS

A total of 17 first-order factors emerged
from the centroid analysis. The clustering
technique and the analysis of the first-order
factor matrix yielded a general factor, 2
third-order factors, and 4 second-order fac-
tors. The number of factors extracted was
identical in both analyses and the number of
factors at each level in the hierarchy was the
same; although a new second-order factor
emerged in the analysis of the 1966 data, a
first-order factor from the 1956 data failed to
emerge in the 1966 analysis, and a second-
order factor from 1956 became a first-order
factor in 1966.

Table 1 shows the factor names, Tucker
coefficients, and loadings on defining items for
the two studies. The item wordings for several
of the items are abbreviated for the purpose
of conciseness.

These first-order factors are very similar to 
those found by previous investigators in fac- 
tor-analytic studies of job attitudes (Ash, 
1954; Baehr, 1954; Harrison, 1961; Roach, 
1958; Twery, Schmid, & Wrigley, 1958; 
Wherry, 1954). 

The factor pattern and the item loadings 
are quite similar between the two studies. 
Ten of the 16 Tucker coefficients were above 
.90 and 4 were above .80. Inspection of the 
loadings of the individual items indicates that 
with few exceptions they are of approxi- 
mately the same magnitude in the two studies. 
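
The Tucker coefficients reported here summarize how closely the loadings of one factor agree across the two analyses. As a minimal sketch, assuming the standard coefficient of congruence (sum of cross-products of the loadings divided by the square root of the product of the sums of squares), the computation can be written as follows. The function name and the two loading vectors are hypothetical illustrations and are not taken from Table 1.

```python
import math


def tucker_congruence(loadings_a, loadings_b):
    """Coefficient of congruence between two vectors of factor loadings.

    Assumes both vectors list loadings for the same items in the same order.
    """
    if len(loadings_a) != len(loadings_b):
        raise ValueError("loading vectors must cover the same items")
    cross = sum(a * b for a, b in zip(loadings_a, loadings_b))
    norm = math.sqrt(sum(a * a for a in loadings_a) * sum(b * b for b in loadings_b))
    if norm == 0:
        raise ValueError("at least one loading vector is all zeros")
    return cross / norm


# Hypothetical loadings for one factor in the two surveys (not study data).
loadings_1956 = [0.64, 0.60, 0.55, 0.48]
loadings_1966 = [0.61, 0.55, 0.58, 0.52]

phi = tucker_congruence(loadings_1956, loadings_1966)
print(f"Tucker coefficient: {phi:.2f}")
```

A coefficient near 1.00 indicates that the two loading patterns are nearly proportional; it does not by itself require the loadings to be equal in size.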

The analysis of the intercorrelations be- 
tween the first-order factors resulted in the 
extraction of 4 second-order factors. These 
were identified as corporate-provided need 
fulfillment (1966 only), need fulfillment from 
immediate work environment (.90), imper- 
sonal future rewards (.82), and immediate 
personal rewards (.63). The coefficients in 
parentheses are the Tucker coefficients. 

These 4 second-order factors contributed to
the emergence of 2 third-order factors which
in turn produced a general factor. The third-
order factors were identified as personalized
need fulfillment (.94) and corporate reward
system (.93). The general factor had loadings
above .20 on all but 4 items and the Tucker
coefficient of agreement between the two
analyses was .99. Figure 1 shows the hier-
archical factor structure for the two analyses.
The dotted lines indicate shifts in the factor
organization between the two studies.

Fig. 1. Hierarchical factor structure of employee attitudes 1956 and 1966. (The dotted lines
indicate shifts in structure from 1956 to 1966; *this factor did not emerge in the 1966 study.)


Discussion


This study demonstrates a remarkable sta-
bility for an empirically derived factor struc-
ture over a rather lengthy time period and
for the items that were diagnostic of the struc-
ture. Only 3 of the Tucker coefficients of
agreement between the two analyses were be-
low .80. This stability is emphasized even
further when the loadings of the individual
items are compared. Only rarely does one find
a major shift in the loadings. The vast major-
ity are within .05 to .10 in the two analyses.
These results suggest that psychometricians
can use repeat administration of such instru-
ments and feel confident that observed changes
in intensity and/or direction of the attitudes
assessed are legitimate and are not [illegible]
because of shifts in the factor structure. If one
considers the item factor loadings as a mea-
sure of intrinsic validity, the results suggest
that the item validities are quite stable.

The hierarchical structure obtained in these
analyses implies an organization of the
employee attitude system at least within this
particular company. The general factor sug-
gests a general orientation toward the com-
pany which could serve to determine a specific
attitude by providing a dominant frame of
reference, especially in areas where the
employee may have little direct experience,
such as ability of management, opportunities
for promotion, etc.

This general orientation appears to be sub-
divided into two somewhat less general orien-
tations. These are the employee's attitude
toward what he perceives as the corporate
reward system, which relates to the [illegible]
company management and policies in reward-
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ing him for his efforts on the job, and the 
extent to which the individual perceives his 
personal needs being fulfilled. 

The corporate reward system tends to split 
into attitudes towards more immediate and 
personal rewards contrasted with more future- 
oriented and shared rewards. 

The personalized need fulfillment appears 
to separate into need fulfillment that is 
obtained from the immediate work environ- 
ment, such as the job itself, co-workers, 
supervision, job freedom, etc., and corporate- 
provided need fulfillment that is obtained 
from the general organization environment, 
such as a fair and interested management and 
an organization with which they can identify 
and in which they can have pride. 

This organization of attitudes does not
appear to be as stable as the individual fac-
tors. As Figure 1 illustrates, some of the pri-
mary factors became associated with different
higher order factors in the second study.
This suggests that while the basic attitud-
inal dimensions appear to be quite stable,
their organization into an attitudinal system
is more susceptible to change.

SUMMARY AND CONCLUSIONS 


This study reports the results of two inde-
pendent factor analyses of an employee atti-
tude survey administered in the same com-
pany 10 years apart. The factor structure
shows a remarkable stability across the two
analyses.

The study provides empirical evidence that 
the commonly held assumption that the same 
survey and factor structure can be used over 
a period of time to identify changes in 
employee attitudes is quite tenable. At least 
this is true for the general and primary 
factors identified. 
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PROBLEMS OF ORGANIZATIONAL CONTROL IN 
MICROCOSM:
GROUP PERFORMANCE AND GROUP MEMBER SATISFACTION AS A
FUNCTION OF DIFFERENCES IN CONTROL STRUCTURE


EDWARD L. LEVINE 
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Two aspects of organizational control structure—amount of control possessed 
by all organizational members and distribution of this control—were studied in 
64 three-person groups of undergraduate students, using role-playing techniques. 
The following hypotheses were tested and generally found significant at or 
beyond the .05 level: (a) The higher the total amount of control of group 
members over decision making and the more equally the members share this 
control, the better the groups’ problem-solving performance and the higher the 
members’ satisfaction. (b) These effects may be accounted for, in part, by 
more positive socioemotional interactions within those groups with higher 
amounts and more equal distributions of control. 


In traditional bureaucratic forms of organi- 
zation, some goals of the organization and 
those of its members have been seen as in- 
compatible (see Bennis, 1959). Since this sit- 
uation would appear to limit the commitment 
and motivation of organization members, new 
models (e.g., Likert, 1961) for organizations
have been developed. 

One line of research (Tannenbaum, 1956,
1962a) has been concerned with various as- 
pects of an organization’s control structure 
and how these relate to member satisfaction 
and performance. Thus, distributions of con- 
trol at various hierarchical levels, the total 
amount of control in the organization, and 
the relationship of control structure to the 
personality of organization members have been 
studied. 

The major conclusions from this work have 
been that more effective organizations (e.g.,


+ This report is based on a portion of the author’s 
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ones in which overall performance and satis- 
faction are high) result when the total amount 
of control exercised by all hierarchical levels 
is high. Some have interpreted such power- 
equalization approaches to mean that more 
power must be given to subordinates at the 
expense of management (Bass, 1967; Emery, 
1959). However, the distribution of control, 
which is theoretically independent of the total 
amount of control, does not seem as crucial 
(Tannenbaum, 1962a). In a system of high 
mutual influence, with supportive, employee- 
centered leadership, the boss need not cede his 
power to his subordinates; rather, they in
turn are included in the decision process. 
According to Tannenbaum (1961), organiza- 
tions that have high total control at all levels 
(called by him polyarchic organizations) are 
"best" in general, while organizations that 
have control more concentrated at lower hier- 
archical levels may be effective in voluntary 
associations, such as the League of Women 
Voters. Furthermore, this model of organiza- 
tional functioning is seen as relevant mainly 
for organizations in which the goals of all sub- 
groups are fairly congruent. Zald (1962), for 
example, has shown that high total control 
may be dysfunctional when there are con- 
flicting goals among subgroups within the or-
ganization.

One limitation of the research on which this 
theory is based is the methodology thus far
employed, which depends on the perceptions 
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of organizational members as measured by 
survey questionnaire (Tannenbaum, 1956; 
Tannenbaum & Kahn, 1957). Halo, social 
desirability, self-esteem, and error of mea- 
surement can be serious in such methodology, 
especially when one-item measures are used. 
Furthermore, distribution of control in these 
studies represents amount of control at one 
level relative to amount of control at another 
level and is measured by some form of differ- 
ence score. Since difference scores tend to be 
unreliable, the theoretical statements based on 
such findings may be subject to question. 
Other difficulties with conclusions derived 
from this work on organizational control struc- 
ture include (a) variable frames of reference 
from one respondent to another and from one 
hierarchical level to another (Tannenbaum &
Kahn, 1957); (b) inequality of intervals
between hierarchical levels and between 
amounts of control, which has an unknown 
effect on the data (Smith & Tannenbaum, 
1963); (c) difficulty of separating individual 
phenomenological effects from actual struc- 
tural effects (Sills, 1957; Tannenbaum &
Bachman, 1964); (d) inability to define the 
direction of causality, since a correlational 
paradigm is used almost exclusively (cf. Far- 
ris, 1969); (e) lack of distinctions among 
several aspects of control, such as decisions 
about pay, task structure, working conditions, 
etc. (cf. French, Israel, & As, 1960) and
lack of distinctions among the means used to 
exercise influence (Smith & Tannenbaum, 
1963); (f) lack of clarity in the relationship 
between actual leadership practices, task 
structure, and the amount of total control 
(Bachman, 1968); (g) slight differences from 
study to study in the wording of response 
measures, which limits the comparability of 
results (Smith & Tannenbaum, 1963); (h)
conceptualization and operationaliza-
tion of the concept of control, which may not
be directly comparable with other theoretical
and empirical treatments; and (i) lack of
specification of the functional concomitants
of structural differences, that is, through what 
organizational behaviors does a structural 
difference operate to produce better or poorer 
organizational performance and member satis-
faction.
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Furthermore, interactions between the two 
control structure variables—amount and dis- 
tribution—have never been well explicated. 
Tannenbaum (1961) has demonstrated an 
interaction between total control and distribu- 
tion of control in a voluntary type organiza- 
tion. In that study the “best” local League of 
Women Voters had more control distributed 
to lower levels and had a greater amount of 
total control. Also, it may be that distribution 
of control has more of an impact when amount 
of total control is low. However, the possible 
interaction between these two structure varia- 
bles (amount and distribution of control) has 
not been followed up to any great extent in 
later research. 

The present work attempts to deal with 
several of these issues by providing an exten- 
sion of the model and a translation of it into 
a more general framework and by conducting 
a laboratory experiment to test several com- 
ponents of the model. 


Theoretical Development 


Figure 1 diagrams some theoretical relations 
of control structure variables to performance 
and satisfaction. This, essentially, translates 
Tannenbaum’s theory into the general frame- 
work proposed by Katzell ° and modifies it by 
adding some related work (Cummins, 1967; 
Tannenbaum, 1957; Tannenbaum, 1962b; 
Tannenbaum & Allport, 1956; Williams, 
1965) on relations with personality structure 
and size. This reformation would appear to 
extend Tannenbaum's position, in providing 
for a more clarified specification of the func-
tional concomitants of control structure varia-
bles.

The model suggests, for example, that 
where the members have personalities con- 
ducive to high total control (e.g., possibly 
high in risk taking and need for achievement), 
the structure is polyarchic (or equalitarian in 
voluntary organizations), the organization is 
small, and the goals of subunits are not in 
conflict; there should be increased commit- 
ment of members (Likert, 1961; Tannenbaum, 
1962a), increased striving for goals, better 
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Fic. 1. 


The theoretical relationship of control structure to performance 


and satisfaction. 


communications (Smith & Brown, 1964), and 
less intraorganizational conflict (Smith, 1966) 
—all of which will cause maximal performance 
and satisfaction. 

The model provides for feedback loops so 
that structure can change personality (Tan- 
nenbaum, 1957), the amount of conflict can 
affect structure (Smith, 1966), and the level 
of performance can affect structural and func-
tional variables (Farris, 1969; Tannenbaum,
1962a).


Hypotheses 


Based on the theoretical model outlined, the 
following selected hypotheses were tested ex- 
perimentally in a small group setting: 

1. The higher the amount of total control,
the greater the performance and the higher
the satisfaction a group's members will have.
2. A balanced distribution of control (one 
in which all members share influence equally) 
will lead to better performance and higher 
satisfaction than will an unbalanced distribu- 
tion of control. 

3. When task conditions are more difficult,
the effects of having a balanced distribution
of control and a higher amount of total con-
trol will be even greater; that is, there should
be an interaction between task difficulty and
control structure.

4. Since total control has [illegible]
output when total control [illegible]
is high. This means, in effect, that distribu-
tion of control makes a difference [illegible]
when total control is low.


5. In terms of the functional concomitants 
of control, the high-total-control and bal- 
anced distribution conditions will result in 
less conflict, more striving for goals, and 
greater ease of communication flow than for 
their opposite conditions of control, all of 
which are represented by the behaviors soli- 
darity, tension, agreement, disagreement, and 
aggression in this study.


METHOD 


The method chosen is a laboratory experiment, 
which sought both to treat these problems in micro- 
cosm and to eliminate problems of subject deception 
often inherent in such laboratory studies (see criti-
cisms of such studies by Argyris, 1968) through role-
playing techniques.


Subjects 


Subjects consisted of 192 undergraduates enrolled
in a basic psychology course. Each group consisted
of three males or three females randomly assigned
roles to play and chairs to occupy. Sets of roles cor-
responding to the various control structure conditions
were randomly chosen.

The eight conditions (two levels of amount of con-
trol, two levels of distribution of control, and two
levels of task difficulty) each contained eight groups,
four male groups and four female groups, for a
total N of 64 groups and 192 subjects.


Procedure 


The groups of three subjects seated around a
table were told the following: they were part of a
research team; they would not be fooled in any way;
the experiment would be fully explained later; they
were to play a role and would be given instructions
separately; and the success of the experiment de-
pended on how well they played their roles.

Roles in the set assigned to each group were assigned
randomly to the subjects, who were allowed a few
moments to read through them carefully in [illegible].
Then the experimenter went over the role with each
subject and explained any difficulties the subject
might have.
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_ Each group's session was videotaped for later cod- 
ing by means of Bales’ Interaction Process Analysis 
(Bales, 1950) for the dual purposes of determining 
whether, in fact, the individuals were performing 
their assigned roles and whether the hypothesized 
links between group process, the control structure 
inputs, and group outcomes were present. The video- 
tapes were coded by the experimenter three separate 
times, once for each individual.

Since the experimenter was compelled to code the 
tapes himself, an independent observer without 
knowledge of the experimental conditions was em- 
ployed to check the objectivity of the experimenter's 
codes. This check involved having both the experi- 
menter and the independent observer code, by indi- 
viduals, 6-minute segments of each group session—2 
minutes early in the session, 2 minutes midway 
through, and 2 minutes late in the session. 

Subjects were informed that they would be given 
feedback on how well they were doing early in the 
group session. This feedback came after a prelimi- 
nary practice session and again after the first of two 
experimental games was completed. Subjects were 
reassembled, and a practice game, followed by two 
experimental games, ensued. Finally, a full debriefing 
session was conducted. 


The Control Structure Manipulations 


Based on prior research (e.g., Borgatta & Bales,
1956; Katzell, 1969), the relevant behaviors for 
subjects to perform when they are to be programmed 
as “high-control” group members include, primarily, 
a high rate of interaction, a high amount of gives 
suggestion, a low amount of tension release, and a 
low amount of asks for information, opinion, and 
suggestion, though the emotional category is deem-
phasized to simplify role instructions. The "low-
control" roles were structured in the opposite direc-
tion: a low rate of interaction, a low amount of gives
suggestion, a high amount of "asks" behaviors, and
a high amount of tension release.

It should be noted that all subjects were exhorted
in both written and oral instructions to do their
best to solve the experimental problems.

The instruction not to participate actively given to 
low-control individuals did not preclude their adopt- 
ing roles such as being the recorder or their working 


on the problems silently.

The four major analysis of variance (ANOVA)
cells of interest are represented by the following sets
of prescribed roles (for a three-member group).

Cell 1: high amount of control, balanced distribution (n = 16 groups)
    Three high-control role players in each group
Cell 2: low amount of control, balanced distribution (n = 16 groups)
    Three low-control role players in each group
Cell 3: high amount of control, unbalanced distribution (n = 16 groups)
    Two high-control role players and one low-control role player in each group
Cell 4: low amount of control, unbalanced distribution (n = 16 groups)
    Two low-control role players and one high-control role player in each group
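
The design described above can be checked with a little arithmetic. The following is a minimal sketch, assuming the 2 × 2 × 2 layout with eight groups per condition and three members per group as stated in the text; the names `roles_for`, `conditions`, and the level labels are hypothetical and are used only for illustration.

```python
from itertools import product

AMOUNT = ("high", "low")                  # amount of total control
DISTRIBUTION = ("balanced", "unbalanced")  # distribution of control
DIFFICULTY = ("easy", "difficult")         # task difficulty
GROUPS_PER_CONDITION = 8                   # four male and four female groups
GROUP_SIZE = 3


def roles_for(amount, distribution):
    """Return the three prescribed roles for one group in a given cell."""
    if distribution == "balanced":
        return (amount, amount, amount)
    # Unbalanced cells pair two players of the cell's level with one opposite.
    other = "low" if amount == "high" else "high"
    return (amount, amount, other)


conditions = list(product(AMOUNT, DISTRIBUTION, DIFFICULTY))
n_groups = len(conditions) * GROUPS_PER_CONDITION
n_subjects = n_groups * GROUP_SIZE
# Each ANOVA cell (amount x distribution) pools both difficulty levels.
groups_per_cell = len(DIFFICULTY) * GROUPS_PER_CONDITION

print(len(conditions), n_groups, n_subjects, groups_per_cell)
```

Under these assumptions the counts agree with the 64 groups, 192 subjects, and 16 groups per cell reported in the text.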


Task and Measures 


The task chosen for this experiment is a modified 
version of a game used by Hollander (1960). The 
task is centered around a 7 × 7 matrix in which
columns are designated by color names, rows by let-
ters. The elements are payoffs ranging from -12 to
+15. Group members must attempt to maximize 
their payoffs over two games of 15 trials each by 
picking rows (one per trial) which, when paired with 
the predetermined column for that trial, will yield 
high plus values. The experimenter has preselected all 
the columns according to a numerically based sys- 
tem, for example, three consecutive columns—skip 
one, in a forward progression. Thus if the group can 
guess the system, they can predict the column to be 
announced on a trial and choose the row which will 
intersect at the highest payoff element in the col- 
umn. (A group first announces their row choice after 
a period of discussion and decision; then the experi- 
menter announces the column.) Guessing the system 
also allows the group to complete the game quickly. 
See Levine and Weitz (1971) for a fuller explica- 
tion. The two output variables used were "time to 
completion" of the games played and the group's 
“payoff score." Two levels of task difficulty, labeled 
"easy" and “difficult,” were administered. 

A postexperimental questionnaire contained, among 
others, measures of group satisfaction (Katzell, 1969) 
and control possessed by self and other group mem- 
bers. The Bales’ (1950) interpersonal coding schema 
was used to code the subjects’ interactions. This 
schema appears to have adequate reliabilities (cf. 
Katzell, 1966). All computed percentages for each of 
Bales’ categories were based on the group’s total 
number of interactions and were transformed to arc- 
sin scores for analysis, as recommended by Bales 
and Hare (1965). The reciprocal of time to comple- 
tion in seconds over both games was used for analy- 
sis, since this type score transformation is recom- 
mended for time scores (Edwards, 1960). A constant 
of 100 was added to the total payoff scores of each 
group to preclude negative scores. 
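
The three score transformations mentioned above can be sketched as follows. This is an illustration under stated assumptions: the text does not give the exact form of the arcsine transformation, so the common arcsin-of-square-root form expressed in degrees is assumed here, and all function names are hypothetical.

```python
import math


def arcsine_transform(proportion):
    """Arcsine (angular) transformation of a proportion, in degrees.

    Assumption: arcsin(sqrt(p)); the source does not state the exact form.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.degrees(math.asin(math.sqrt(proportion)))


def reciprocal_time(seconds):
    """Reciprocal of time to completion, so larger values mean faster groups."""
    if seconds <= 0:
        raise ValueError("time must be positive")
    return 1.0 / seconds


def shifted_payoff(total_payoff, constant=100):
    """Add a constant to the total payoff score to preclude negative scores."""
    return total_payoff + constant


# Example: a category that accounts for 25% of a group's interactions.
print(arcsine_transform(0.25), reciprocal_time(200), shifted_payoff(-40))
```

The reciprocal reverses the direction of the time scores, which is worth keeping in mind when signs of correlations with other variables are interpreted.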


RESULTS 


Checks on the experimental manipulations 
indicated that the experimenter’s tallies on the 
Bales’ categories for all groups were reason- 
ably objective and that no transformations of 
the frequencies he arrived at were needed 
(for details, see Levine, 1970). 

Extensive checks were made on whether 
the control structure conditions that were 
manipulated were actually obtained. These 
checks consisted of judgments by an indepen- 
dent observer (who was ignorant of actual 
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TABLE 1 
CLASSIFICATION OF INDIVIDUAL SUBJECTS INTO
HIGH- AND LOW-CONTROL CATEGORIES BY
AN INDEPENDENT OBSERVER

                        Observer's assignment
Actual assignment    High control   Low control   Total
High control              89             7          96
Low control               12            84          96
Total                    101            91         192

Note. Level of agreement (Kappa) = .80; χ² corrected = 120.66 (p < .005); φ = .79.

role assignments) and analysis of interaction 
process scores at both the group and indi- 
vidual levels, as well as the perceptions of 
group members. Table 1 contains evidence de- 
rived from the first method. 
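
The agreement statistics reported with Table 1 can be recomputed from the four cell counts. The sketch below assumes Cohen's kappa, a chi-square with Yates' continuity correction, and φ taken as the square root of the corrected chi-square divided by N; that last choice is an inference from the reported values, not something the text states. Variable names are hypothetical.

```python
import math

# Rows: actual assignment (high, low); columns: observer's assignment (high, low).
table = [[89, 7],
         [12, 84]]

n = sum(sum(row) for row in table)
row_totals = [sum(row) for row in table]
col_totals = [table[0][j] + table[1][j] for j in range(2)]

# Cohen's kappa: observed agreement corrected for chance agreement.
observed = (table[0][0] + table[1][1]) / n
expected = sum(row_totals[i] * col_totals[i] for i in range(2)) / n ** 2
kappa = (observed - expected) / (1 - expected)

# Chi-square for a 2 x 2 table with Yates' continuity correction.
ad_minus_bc = abs(table[0][0] * table[1][1] - table[0][1] * table[1][0])
chi2_corrected = (n * (ad_minus_bc - n / 2) ** 2
                  / (row_totals[0] * row_totals[1] * col_totals[0] * col_totals[1]))

# Phi derived from the corrected chi-square (assumed, see lead-in).
phi = math.sqrt(chi2_corrected / n)

print(round(kappa, 2), round(chi2_corrected, 2), round(phi, 2))
```

With the counts in Table 1 these formulas give values that round to the reported kappa of .80, corrected chi-square of 120.66, and φ of .79.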

The judge seemed easily able to classify
subjects in accordance with their actual as-
signment. The level of agreement corrected
for chance (Kappa) is .80 (Cohen, 1960), φ
is .79, and chi-square is highly significant
when the judge's assignments are compared
with actual assignments. As predicted, the
check on problem difficulty revealed that easy
tasks took less time to complete (p < .001)
and resulted in higher payoffs (p < .001) than
difficult tasks. Other evidence confirmed that 
the variations in control conditions did occur. 
Additional variables that might have had some 


TABLE 
ANCE EFFECTS 


Main Anatysis or Vari 
= 
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effects on the data were considered. These in- 
cluded sex of group members, problem se- 
quence, effect of similarity in role behavior 
upon ratings, and subjects’ guesses about the 
hypotheses. All were found to have little in- 
fluence on the data. 


Effects of Variations in Control Structure on 
Group Outputs & Behavior 


In view of the multivariate nature of the 
data, the interrelationships among the relevant 
variables must be considered before the re- 
sults of the tests of the hypotheses are given. 
The pattern of the correlations indicates that 
the output variables (criteria) are positively 
interrelated, but not to the extent that differ- 
ential effects of the independent variables are 
precluded. The relationships of process and 
output variables suggest that good perfor- 
mance and/or satisfaction are positively re- 
lated to solidarity, agreement, and (surpris- 
ingly) disagreement, but negatively to tension 
and aggression. 


Main Effects 


Table 2 contains the main effects of the 
experimental conditions on the relevant varia- 
bles. 

Amount of control appears as a potent cause 
of group differences, with time to completion, 
payoff, and satisfaction more favorable under 
high-control conditions. High-control groups 


TABLE 2
MAIN ANALYSIS OF VARIANCE EFFECTS ON THE VARIABLES OF INTEREST


Amount of control — | Distribution Task difficulty 
Variable -— = | — 
High | Low ; ly | | | ; 
" n sh E | F Balanced Unbalanced | F Easy | Hard F 
Output | | E d BS NEZ cxx MN 
Pineda completion | 29.17 | 45.72 | 19.20%% 41.28 34.61 0.98 | 3143| 4446 | 20.84*** 
ats To |2097) Ste | asso EE 541* | 20144 | 178.72 | 20.877" 
| 0995 1584 | 15309 | 20 18.50 — 537* | 2034 | 18.56 | 4.68" 
Solidarity 15.19 | 134: 
Agreement 17.09 os ELS | 14.26 14.36 0.04 | 14.50| 14.003, 1.23 
; | 1478| 16.48 | 151 16.25 1.28 | 16.90| 14.97 | 11.59** 
Disagreement 8.05 5.29 | 36.01+** 6.95 Pu | ees pase HE 0.02 
SCR ! 30.67 34.54 | 14.1399 3154 ae res Eid 3363| 3.98" 
j Span 9.79 | 10.89 2.35 | 10.39 10.29 0.02 " 9.92 | 10.76 | 1.40 
* Mean arcsin scores, 
b <05 
**b c. 
sp £001. 
TABLE 3
INTERACTIVE EFFECTS OF AMOUNT AND DISTRIBUTION OF CONTROL
| Amount of control 
| High low 
Variable | p - — - —— — 
| Distribution of control 
|-—— —— T RE 2 
Balanced Unbalanced Balanced Unbalanced | F 
Output E 
Time to completion 25.71 32.63 56.85 | 36.60 | 9,05*** 
Payoff | 283.56 236.81 227.63 | 192.31 0.11 
Satisfaction | 2268 19.50 i8190 | — 1730. | — 219 
Process* | | 
Solidarity | 14.97 | 13.75 | 
‘Tension release | 13.85 | 15.97 
Agreement | 17.22 | 15.28 | 
Gives suggestion 20.16 | | 17.79 j 
Gives evaluation 21.19 19.30 . 
Gives information 21.03 18.19 19.88*** 
Asks information 10.26 | 10,10 14,03€» 
Asks evaluation | 10.56 12.86 6.33** 
Asks suggestion | 4.10 5.90 17.83*** 
Disagreement | 7.14 . 5.63 naan 
Tension | 31.54 33.28 35.81 0.15 
Aggression | 10.37 11.56 10.22 3.05 
Density 49 ET B s.60*** 


a Mean arcsin scores, except for Density.


Seb. 5. 

**p en. 

also show more solidarity and agreement, less tension, and, paradoxically,
more disagreement. Thus, partial support has been given to Hypothesis 1 and
a portion of Hypothesis 5; that is, the higher the amount of total control,
the better the performance and satisfaction, and the more positive the
socioemotional behaviors.

The effects of distribution of control are less salient. Although balanced
groups show higher payoffs and more satisfaction, the only process affected
was tension, which showed up less in balanced groups. These findings lend
partial support to Hypothesis 2 and a portion of Hypothesis 5: that is, a
balanced distribution of control leads to better performance and
satisfaction and to more positive socioemotional behaviors.

Task difficulty also had effects in addition to those expected (higher
payoffs and shorter completion times for easy as opposed to difficult
tasks). Groups were more satisfied, showed more agreement, and were less
tense when tasks were easy.


Interactive Effects

The interactive effects of the two control 
structure variables on outputs and group proc- 
ess appear in Table 3. 

The single significant interaction in the out- 
put criteria suggests that the significant main 
effect of amount of control on time to com- 
pletion must be qualified by the kind of dis- 
tribution of control. When amount of control 
is high, the balanced groups complete the 
games somewhat faster; but when amount of 
control is low, the unbalanced groups are 
quicker. 
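
The crossover described above can be read directly from the four cell means for time to completion in Table 3. The sketch below only restates those means as simple effects; it is an illustration, not the analysis of variance reported in the table, and the names `TIME_TO_COMPLETION` and `simple_effect` are hypothetical.

```python
# Cell means for time to completion, as read from Table 3.
TIME_TO_COMPLETION = {
    ("high", "balanced"): 25.71,
    ("high", "unbalanced"): 32.63,
    ("low", "balanced"): 56.85,
    ("low", "unbalanced"): 36.60,
}


def simple_effect(means, amount):
    """Balanced minus unbalanced mean at one level of amount of control."""
    return means[(amount, "balanced")] - means[(amount, "unbalanced")]


# Negative value: balanced groups are faster under high control.
high_effect = simple_effect(TIME_TO_COMPLETION, "high")
# Positive value: unbalanced groups are faster under low control.
low_effect = simple_effect(TIME_TO_COMPLETION, "low")
# Difference of differences: the size of the interaction in the cell means.
interaction_contrast = low_effect - high_effect

print(round(high_effect, 2), round(low_effect, 2), round(interaction_contrast, 2))
```

The sign change between the two simple effects is what qualifies the main effect of amount of control: the balanced advantage under high control is small, while the unbalanced advantage under low control is roughly three times as large.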

It should be noted that the low-amount-of-control — balanced-distribution
groups are the only ones with no high-control role player. Therefore, the
expectation is that a significant interaction could be explained on this
basis alone, as follows: Using a comparison based on the number of
high-control role players in each of the four types of group, the difference
between balanced-low and unbalanced-low on control-related variables is the
reverse of the difference to be expected from the balanced-high versus
unbalanced-high. All interpretations of a significant Amount × Distribution
interaction must take this into account.

Nevertheless, the interactive effect on time to completion is
disproportionately large. Although the differences are slight (from about
4 to 7 minutes) between balanced-high, unbalanced-high, and unbalanced-low
groups when compared in this order of decreasing amount of control, the
difference between [...] balanced-low and bal[...] presence of at [...] an
effect [...] group decisions.

Of the significant interactions in group process, only three seem to exceed
the expected interaction limits. The balanced-low groups show more tension
release, asks suggestion, and density of interaction than we might expect in
moving stepwise from groups with three high-control members to groups with
none. No qualification of amount of control main effects is indicated here,
since all significant interactions are of the form to be expected from the
manipulation of amount of control. But the presence of the main effects of
distribution of control (not shown in the present set of tables), which
reveal that balanced groups show more tension release, ask for more
suggestions, and interact more (approaching significance), should be
ascribed mainly to the somewhat aberrant balanced-low cell.

The results do not support Hypothesis [...], for the only significant
interaction among the criteria is in a direction opposite to the one
predicted. As stated, the distribution of control made a greater difference
in time scores when the amount of control was low; however, the unbalanced
groups were quicker.

TABLE 4
INTERACTIVE EFFECTS OF AMOUNT OF CONTROL AND TASK DIFFICULTY

                   | High amount of control | Low amount of control |
Variable           | Easy    | Difficult    | Easy    | Difficult   | F
Output
Time to completion | 20.00   | 38.33        | 42.86   | 50.59       | 9.93***
Payoff             | 300.50  | 219.88       | 282.38  | 137.56      | 3.31*
Satisfaction       | 21.19   | 20.94        | 19.50   | 16.19       | 3.46*
Process^a
Solidarity         | 15.67   | 14.71        | 13.52   | 13.35       | 0.60
Tension release    | 13.23   | 13.87        | 17.96   | 18.77       | 0.01
Agreement          | 18.24   | 15.93        | 15.56   | 14.00       | 0.42
Gives suggestion   | 21.98   | 19.30        | 16.45   | 16.26       | 5.85**
Gives evaluation   | 21.13   | 21.63        | 19.96   | 18.16       | 4.22**
Gives information  | 21.43   | 22.60        | 17.05   | 16.19       | 3.15*
Asks information   | 9.62    | 9.30         | 11.53   | 11.09       | 0.01
Asks evaluation    | 9.88    | 10.48        | 14.99   | 13.74       | 1.08
Asks suggestion    | 3.29    | 4.12         | 8.80    | 8.79        | [illegible]
Disagreement       | 7.57    | 8.53         | 5.83    | 4.75        | 4.91**
Tension            | 29.94   | 31.41        | 33.22   | 35.86       | 0.32
Aggression         | 9.16    | 10.43        | 10.68   | 11.09       | 0.35
Density            | 54.73   | 53.99        | 45.74   | 38.04       | 2.38

a Mean arcsin scores used, except for Density.
* p < .08.
** p < .05.
*** p < [illegible].

Table 4, which contains the data on the
interactive effects of amount of control and 
task difficulty, reveals some interesting infor- 
mation. 

The outputs are somewhat contradictory. 
Time to completion indicates that high 
amount of control is more facilitative when 
the task is easy than when it is difficult; but 
the payoff differences (approaching signifi- 
cance) reveal that the high-control groups are 
relatively more effective than low-control 
groups when the task is difficult. The latter 
trend also occurs when group satisfaction is 
considered. 
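
The apparent contradiction among the outputs is easier to see when the high-control advantage is computed separately for easy and difficult tasks. The sketch below uses the output cell means from Table 4; it is a descriptive restatement only, and the names `OUTPUT_MEANS` and `control_advantage` are hypothetical.

```python
# Output cell means from Table 4, keyed by (amount of control, task difficulty).
OUTPUT_MEANS = {
    "time_to_completion": {
        ("high", "easy"): 20.00, ("high", "difficult"): 38.33,
        ("low", "easy"): 42.86, ("low", "difficult"): 50.59,
    },
    "payoff": {
        ("high", "easy"): 300.50, ("high", "difficult"): 219.88,
        ("low", "easy"): 282.38, ("low", "difficult"): 137.56,
    },
    "satisfaction": {
        ("high", "easy"): 21.19, ("high", "difficult"): 20.94,
        ("low", "easy"): 19.50, ("low", "difficult"): 16.19,
    },
}


def control_advantage(outputs):
    """High-control minus low-control mean, per variable and task difficulty."""
    result = {}
    for name, means in outputs.items():
        result[name] = {
            difficulty: round(means[("high", difficulty)] - means[("low", difficulty)], 2)
            for difficulty in ("easy", "difficult")
        }
    return result


for name, gaps in control_advantage(OUTPUT_MEANS).items():
    print(f"{name}: easy = {gaps['easy']}, difficult = {gaps['difficult']}")
```

For time to completion a negative gap favors high control, and the gap is larger for easy tasks. For payoff and satisfaction a positive gap favors high control, and the gap is larger for difficult tasks.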

The group process data are very enlighten- 
ing. Groups with high control give relatively 
more suggestions when tasks are easy. This 
finding may explain their relatively quick so- 
lution time; having figured out the game 
quickly, they can focus on completing their 
trials. Difficult tasks, on the other hand, cause 
problems. More evaluation and information 
are necessary to solve them. High-control 
groups give relatively more evaluative state- 
ments and information (approaching signifi- 
cance) than low-control groups, and more 
disagreement occurs among high-control 
groups when tasks are difficult. Finally, there 
is some propensity for high-control groups to 
work harder (in terms of the frequency of 
their interaction) than low-control groups 
when tasks are difficult. It may be, therefore, 
that these activities lead to relatively better 
payoffs for high control, as opposed to low, when tasks are difficult. This
statement offers some justification for Hypothesis 3—amount of control tends
to count more when tasks are difficult.

The interactive effects between task difficulty and distribution of control
are notable by their absence. The superiority of balanced over unbalanced
groups on payoff and satisfaction holds up equally well regardless of the
ease or difficulty of the task. Only one trend, the somewhat greater degree
of exchange of information by balanced groups when tasks are difficult
(p < .11), offers any support for the prediction (Hypothesis 3) that
balanced groups would perform better on difficult tasks relative to
unbalanced groups, since results have shown that giving information is
positively related to performance.

Levine and Weitz (1971) have demon- 
strated that Hypothesis 3 is more strongly 
supported, both for amount of control and dis- 
tribution, when trial-by-trial data rather than 
overall payoff scores are utilized. 


Discussion 


Two questions have been raised by the 
results: (a) Why have high-amount-of-control 
groups and  balanced-distribution groups 
achieved higher group payoff scores and more 
satisfied members than their counterparts? 
(b) Why were high-amount-of-control groups 
faster at completing the game? An answer to 
the questions may be attempted at two levels, 
the interpersonal and the individual. 

First, at the interpersonal level, the author 
suggested a theory earlier that control struc- 
ture may be linked to output by means of 
several functional concomitants. To reiterate, 
a high amount of control and a balanced dis- 
tribution of control should lead to increased 
striving for goals, better communications, and 
decreased conflict—the result being better per- 
formance and satisfaction on the part of the 
group or organization. In this experiment, 
these functional variables were represented by 
selected categories in Bales’ IPA which were 
not manipulated by the role instructions, 
namely, aggression, agreement, disagreement, 
tension, and solidarity. 

For amount of control, the predictions were 
largely supported, since high-amount-of-con- 
trol groups showed less tension and tended to 
show less aggression (p < .13) and more soli- 
darity and agreement than low-amount-of- 
control groups. However, contrary to this hy- 
pothesis, “high” groups also had a higher rate 
of disagreement. 

By way of explanation for this result, find- 
ings on disagreement indicated a positive rela- 
tion to payoffs and satisfaction. These findings 
suggest that the “freedom to disagree” is im- 
portant to the needs of group members and to 
task accomplishment. It may be that disagree- 
ment in the context of the task is functional 
in this study; however, when groups fulfill 
other than task goals, disagreement may become
more exclusively socioemotional in nature and assume
its expected role as an indicator of conflict. Such an interpretation is
weakened somewhat by the pattern of results 
on disagreement for the four control structure 
conditions. The high-amount-of-control — bal- 
anced-distribution groups disagreed most, as 
might be expected based on their equal status, 
but the low-amount-of-control — balanced-dis- 
tribution groups disagreed least. This may be 
due to two factors: (a) These groups took so long to decide and to come up
with ideas that disagreement about anything was discouraged. (b) The
emergent task accomplisher(s) in the group had a great deal of time to
evaluate various approaches in their own minds, and disagreements were less
likely to show up as overt behaviors.

For distribution of control, the evidence on 
socioemotional facilitation was less compelling. 
The better quality payoffs and satisfaction 
achieved by balanced groups as opposed to 
unbalanced groups would seem to rest solely 
on the single finding which supported the 
hypothesis—that balanced groups showed less 
tension. This seems an unlikely supposition; rather, the interpretation,
based on the author's observations of these groups, that balanced groups
allowed for freer expression of ideas, especially by those members who had
more task-relevant ability and interest, seems more reasonable.

[...] group. Since ability [...] not have been operative, [...] assignment
of individuals [...] conditions, the most likely [...]ter individual
performance [...] there are several possible [...] account for increased
motivation when individuals have a greater say in decision making and share
the influence equally.

Vroom (1964), for one, suggests “ego-involvement.” When an individual helps
to make a decision on some particular issue, then the success or failure of
the decision tests the adequacy of his self-concept. Failure becomes
threatening and success rewarding.

Brehm’s (1966) theory of psychological 
reactance also seems relevant. Here, reactance 
is viewed as a motivational state directed to- 
ward reestablishment of an eliminated free- 
dom. If we conceive of individuals in the low- 
control condition as having their freedom to 
actively engage themselves in group discus- 
sion taken away by the role instruction “do 
not participate actively,” then reactance 
would be engendered. However, since sub- 
jects could not really reestablish their elimi- 
nated freedom, because their contract with 
the experimenter would not allow this, a 
means of reducing the reactance tension might 
be the derogation of the value of solving the 
“insignificant” problem. 

What about the effect of distribution in a 
reactance framework? In the unbalanced con- 
dition, where groups were composed of high- 
and low-control role players, the freedom of 
low-control role players to engage in group 
discussion was hampered both by the experi- 
menter and by the presence of high-control 
players. Therefore, the sum total of reactance would seem to be greater
among low-control subjects in the unbalanced-distribution condition.
Furthermore, in the low-amount-of-control — unbalanced-distribution groups,
the high-control role player who was faced with two rather silent partners
may have felt reactance as a function of his inability to sit back for a
moment and let someone else take over the discussion. Theoretically, then,
it might be expected that the low-amount-of-control —
unbalanced-distribution groups, in which both high- and low-control subjects
felt reactance, would perform most poorly and be the least satisfied set of
groups. This did, in fact, occur in terms of the quality of their
performance (group payoff scores) and satisfaction. They were also the
second poorest set of groups in their time to completion. (The
low-amount-of-control — balanced-distribution groups were the slowest.)


The results of this study, of course, cannot distinguish among the
explanations offered for increased individual motivation, but they may be
considered as corroborative of any, or all, of those positions that would
lead to the hypotheses tested and supported by this study. The same
statement could apply at the interpersonal level, since other theoretical
positions would lead to essentially the same hypotheses (cf. Vroom, 1964).


CONCLUSION 


Several limitations of the experiment were 
treated at some length elsewhere (Levine, 
1970). Among them were the problems of the 
possible confounding influences of competence 
structure, fatigue, the experimenter’s expecta- 
tions, the necessity for the experimenter to 
code his own tapes, the lack of independence 
between the manipulations of amount and 
distribution of control, and the possible ef- 
fects of the role instructions themselves in 
producing the results. The one that produced 
the most difficulties for interpretation of the 
data, it was found, was the lack of indepen- 
dence in the manipulation of amount and dis- 
tribution of control. This seriously weakened 
the findings on the Amount X Distribution 
interaction. Further research is needed to 
assess their joint effects. 

The main reservation about the quality of this “simulation” per se concerns
the power base used to effect the control structure manipulations. Although
it was successful, it does not seem to represent the usual basis of power,
especially of power discrepancy, in organizations—namely, legitimate power
or authority. This objection is pertinent if it can be shown that different
sources of power discrepancy have different functional and performance
concomitants.

Collaros and Anderson (1969) found that status differences established by
means of expert power impeded creativity. The present study's results showed
that group problem-solving quality was better when all were of equal status.
If some amount of creativity is necessary to achieve higher payoffs, then
the manipulation of power discrepancy may be seen to be similar in effect to
a power discrepancy set up in a completely different fashion, although still
not one that reflects organizational authority. Bridges, Doyle, and Mahan
(1968) varied group power distributions in just such a manner; that is, by
difference in hierarchical rank. They created two kinds of groups, either by
using one principal and three teachers or by using four teachers. Those
groups composed of one principal and three 
teachers were less productive, less efficient, 
and less inclined to take risks. These results 
tend to support those of the present study. 
Unfortunately, however, in neither of these 
experiments is it possible to assess directly 
and unequivocally whether the results were 
due to amount of control, distribution of con- 
trol, or some combination. Nevertheless, it 
would appear that distribution of control was 
primarily involved. 

The present study’s findings are in line with 
those of research studies of a survey type done 
in actual organizations. This convergence sug- 
gests that the results deserve careful consider- 
ation in the design of effective groups and 


organizations. 
Previous attempts at assessing attitudes toward applied psychology have been
directed toward management groups. Two Thurstone and two Likert scales were
constructed to measure attitudes held by widely diverse work groups, such as
union officials and personnel managers, toward industrial psychology and
testing. Reliabilities of .81 and .84 were obtained for Thurstone scales and
.95 and .92 for Likert scales. All scales successfully discriminated among
extreme "known groups." Issues dealing with construct validity were
discussed.


Psychological testing has attracted a good deal of adverse public reaction
in the past decade. Progress in industrial psychology may be greatly
affected by the attitudes of its special public, the management and workers
of industry, and by the attitudes of the greater public which may shape new
legislation and restrictions detrimental to the field's existence.

In surveying [...] attitudes of particular groups toward industrial
psychology and testing, one is [...] the fact that these studies have mainly
concerned themselves with the attitudes or opinions of management personnel
(Stagner, 1946; Thornton, 1969). These studies indicate that, generally,
management is favorable toward the application of psychological principles
in industry.

An article by Shostak (1964) offers [...] insight into the worker's view of
[...] psychology. He examined the relationship [...] the industrial
psychologist and trade [...] and termed it “a matter of mutual [...]nce.” In
examining why industrial psycho[...] labor-management issues, he offered the
following possible explanations: (a) businessmen sponsors did not call for
research on the topic; (b) career considerations and methodological problems
discouraged such research; and (c) the unions themselves showed no interest
in such research and were suspicious of, if not hostile to, industrial
research on a psychological level.

Several articles dealing with attitudes toward
psychological testing in industry have
also appeared in the literature (Brim, 1965; 
Ward, 1960). There seems to be a consensus 
that most unions and their members are anti- 
test in their attitudes just as most business 
executives are assumed to support their use. 
This idea of a continuum with management 
positioned at one end and union officials at 
the other may be a valid description. How- 
ever, the almost total lack of research into 
the attitudes of the blue-collar worker con- 
cerning testing and psychology makes any 
attempt at such a description very difficult. 

The intention of the present study was to 
measure objectively the attitudes in question 
by constructing valid and reliable instruments 
using groups drawn from both management 
and labor populations. The two most common attitude scaling
procedures—Thurstone and Likert—were examined, and the decision was made to
utilize both methods due to the advantage offered by multiple measuring
instruments in testing convergent and discriminant validities (Campbell &
Fiske, 1967).
TABLE 1
NUMBER AND OCCUPATIONAL CATEGORY OF SUBJECTS
IN GROUP 3—LIKERT ITEM SELECTION GROUP


Blue | 
| collar 


Manage- 


|mentand| White Blue 


collar 7 | collar , Total 


Subgroup E son i 
nonunion) poa- | union | 
| | union 


3a (industrial | 
ychology 


scale) | a 8 1 38 | s 
3b (psychological y 
testing scale) 56 12 1 M | 103 


attitudes toward industrial psychology and attitudes 
toward psychological testing. Two parallel forms of 
scale development were used, Likert and Thurstone 
techniques. It was felt to be important to develop 
these scales using a broad range of industrial groups. 
Such an approach would enable the determination 
of scale usability in terms of their ability to discrim- 
inate among attitudes held by divergent groups. 


Subject Samples 


Group 1—Initial item pool selection group. This group comprised 183 adult
male volunteers drawn from a larger sample of 350 members of the United
Steel Workers of America, American Federation of Labor and Congress of
Industrial Organizations (AFL-CIO), attending training courses at the
Pennsylvania State University.

Group 2—Thurstone judging 
comprised 104 male volunteers 
psychology classes at Pennsylvania Stat 


ct) were sent out to the various 
Table 1. The response rate was 
very low (20%) and comments indicated that the 
consuming. An additional 
100 forms with 100 
itude object and an- 
with statements relating to the other 


were 
management develop- 


held at Pennsylvania 
maining subjects were 
n industries located in 


State University, while the ps 
obtained from Work crews i 
Reading, Pennsylvania, 
Groups 4a through 4g—Final scale de 
groups. With the exceptions of g 
subjects were obtained from co; 
ducted at Pennsylvania State U; 
Group 4a—This group co: i 
officials. Their rank on | c PS 


velopment 
roups 4d and 4f, all 
nferences being con. 
niversity, 


oup comprised 39 i 
activists. Activists Were defined as male union 


s A A union 

holding either a minor union office or no aan 
Group 4c—This ETOUD consisted or 36 5 all. 

union activists. 79. female 
Group 4d—This group comprised 22 male and 2
female personnel managers. Scales were originally 
sent out to 97 personnel managers in the Pittsburgh, 
Pennsylvania, area. They were randomly selected 
from the Dun and Bradstreet Million Dollar Direc- 
tory (1971). 

Group 4e—This group was made up of 35 male 
union officials of higher rank (as defined in Group 
4a) of the United Steel Workers of America, AFL- 
CIO. 

Group 4f—This group comprised 27 students (24 male and 3 female) enrolled in
an upper level industrial psychology course at Pennsylvania State
University.

Group 4g—This group consisted of 31 male union activists (as defined in
Group 4b) in locals of the United Steel Workers of America, AFL-CIO.

Thus, Group 4, comprising seven subgroups, totaled 
211 individuals, of whom 27 were full-time students 
and the remainder were 184 working adults employed 
in business and industry in various management and 
labor positions. 


Item Samples 


Items for inclusion in the initial attitude scale item 
pools were obtained from several sources. Group 1 
completed five open-ended questions which were part 
of a larger questionnaire, Their replies to these 
questions were scanned and rewritten as attitude 
statements. Other items were chosen by a selection 
of statements which Were pertinent to the attitude 
objects under consideration from books, magazines, 
and newspapers. 

This process yielded more than 200 items for each 
of the two attitude objects. These items were refined 
to conform to criteria suggested by Edwards (1957). 
In addition, due to the diverse nature of the subjects’ 
backgrounds, only those items which could be clas- 
sified as “fairly easy” or “standard” as defined by 
Flesch (1948) were included in the item pool. Finally,
an attempt was made to balance the number of
positive and negative statements for each list. Using
these guidelines, two lists of 100 attitude statements
each were constructed (one for each attitude object).
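
The readability screen mentioned above can be illustrated with the Flesch Reading Ease formula and its usual descriptive bands. This is a hedged sketch: the source does not say how words, sentences, and syllables were counted, so the counts are supplied by the caller, and the helper names `flesch_reading_ease`, `flesch_band`, and `passes_readability_screen` are hypothetical.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease score from raw counts."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_band(score):
    """Map a Reading Ease score to its usual descriptive band."""
    bands = [
        (90, "very easy"),
        (80, "easy"),
        (70, "fairly easy"),
        (60, "standard"),
        (50, "fairly difficult"),
        (30, "difficult"),
    ]
    for lower_bound, label in bands:
        if score >= lower_bound:
            return label
    return "very difficult"


def passes_readability_screen(words, sentences, syllables):
    """True when an item falls in the 'fairly easy' or 'standard' band."""
    band = flesch_band(flesch_reading_ease(words, sentences, syllables))
    return band in {"fairly easy", "standard"}


# Hypothetical item: one sentence, 12 words, 18 syllables.
print(flesch_band(flesch_reading_ease(12, 1, 18)))
```

The screen here follows the wording of the text literally and keeps only the two named bands. Whether easier items were also retained in the original study is not stated.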

These two 100-item lists were used for both the 
Thurstone and the Likert scale development. Thur- 


d by subjects in Group 


and Likert scaling procedures were
conducted in the traditional manner (Edwards,
1957). Each scaling procedure yielded 48 items (24
for each attitude object) for a total of 96 items. The
final scales were administered to each of the seven
subgroups in Group 4.²


² Copies of the scales are available from Frank J. Landy upon request.


RESULTS

Reliability

Internal consistency estimates of the Thurstone scales were obtained using a
split-half technique applied to odd and even items. The resultant
reliability estimates were corrected by means of the Spearman-Brown formula
to obtain an estimate of the reliability of the full length scales. These
estimates were derived from the responses of the subgroups in Group 4. For
the "attitude toward psychology" scale, the corrected coefficient for the
total group was .81; the individual subgroup values ranged from .66 to .85
(with the exception of 4d, which yielded a value of .38). For the "attitude
toward testing" scale, the corrected coefficient for the total group was
.84, with subgroup values ranging from .75 to .91. Internal consistency
estimates of the Likert scales were obtained using the Kuder-Richardson
Formula 20. For the attitude toward psychology scale, the reliability
estimate for the total group was .95, with subgroup values ranging from .87
to .96. For the attitude toward testing scale, the reliability estimate for
the total group was .92, with subgroup values ranging from .85 to .94. The
average inter-item correlations for the Likert scales, respectively, were
.42 and .33 (Group 4, n = 211).
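
The Spearman-Brown step can be written out explicitly. The sketch below shows the correction for a doubled test length and its inverse. It also shows, as an assumption rather than the study's method, that a standardized internal consistency coefficient computed from the average inter-item correlation and the 24 items per scale comes out close to the Likert reliabilities reported above. The function names are hypothetical.

```python
def spearman_brown(r_half, factor=2.0):
    """Reliability of a test lengthened by `factor`, from the part reliability."""
    return factor * r_half / (1 + (factor - 1) * r_half)


def half_test_r(r_full):
    """Half-test correlation implied by a Spearman-Brown corrected value."""
    return r_full / (2 - r_full)


def standardized_alpha(n_items, mean_inter_item_r):
    """Internal consistency implied by the average inter-item correlation.

    This is the standardized-alpha formula, used here only as an
    approximation; the source reports Kuder-Richardson Formula 20.
    """
    return n_items * mean_inter_item_r / (1 + (n_items - 1) * mean_inter_item_r)


# Odd-even correlation implied by the corrected Thurstone psychology value.
print(round(half_test_r(0.81), 2))
# Likert scales: 24 items each, average inter-item r of .42 and .33.
print(round(standardized_alpha(24, 0.42), 2), round(standardized_alpha(24, 0.33), 2))
```

With 24 items, average inter-item correlations of .42 and .33 imply coefficients of about .95 and .92, in line with the reported Likert reliabilities.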


Validity 

The primary technique for establishing validity was the “known groups”
method (Scott, 1969) in which the instrument is administered to several
groups of subjects, one of which can be confidently assumed to possess the
attitude attribute to a greater degree than the other groups.

In order to compare the results for each subgroup on each scale to determine
whether or not there were any significant differences among the means, a
one-way analysis of variance for each scale was performed, the levels being
the various subgroups in Group 4. Significant F values were found in each of
the four analyses:

Thurstone—psychology (F = 8.72, df = 6/204, p < .001, estimated ω² = [illegible]).
Thurstone—testing (F = [illegible], df = [illegible], p < .001, estimated ω² = [illegible]).
Likert—psychology (F = [illegible], df = 6/201, p < .001, estimated ω² = [illegible]).
Likert—testing (F = 4.39, df = 6/201, p [illegible], estimated ω² = .09).
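
The estimated ω² values can be recovered from F and its degrees of freedom. The sketch below assumes the usual fixed-effects estimate for a one-way design, which the source does not spell out; the function name `omega_squared` is hypothetical. Only the Likert testing analysis is used, since it is the one case where both F and the estimate are legible.

```python
def omega_squared(f_value, df_between, df_within):
    """Estimated omega squared for a one-way fixed-effects ANOVA.

    Derived from SS_between = df_between * F * MS_within, so that
    omega^2 = df_between * (F - 1) / (df_between * F + df_within + 1).
    """
    return df_between * (f_value - 1) / (df_between * f_value + df_within + 1)


# Likert testing scale: F = 4.39 with 6 and 201 degrees of freedom.
print(round(omega_squared(4.39, 6, 201), 2))
```

Under this assumption the Likert testing analysis gives an estimate of about .09, matching the reported value.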
TABLE 2
MULTIPLE COMPARISONS OF GROUP MEANS


Thurstone scales* Likert scales? 


Item | 
= 1 
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Attitude towar 


psychology 4d > 4g, da, de, de, 


1 
| dd «da, dg, de, db, 


ES 4b 
Af <da, dg, 4e, 4b. | 4f > 4g, da 


| 3c «4a | 
| 
Attitude toward | | 
testing Ad < 4a, dg, 4b Ad > 4g, 4a 
A n, 4g | AE S 4g, da 


Note. All multiple comparisons were computed using the 
Newman-Keuls technique. Only significant differences (P 


əwer scores indicate more favorable 


attitudes, 


To look at specific subgroup differences, Newman-Keuls multiple-comparison
tests were run. The results appear in Table 2.

It was also felt that the notion of convergent validation might be useful in
the present context. In this procedure, scores obtained from two different
instruments aimed at the same construct are compared to determine whether
the proposed instrument is actually measuring the intended attribute
(Campbell & Fiske, 1967). The multitrait-multimethod matrix appears in
Table 3.


DISCUSSION 


On the basis of the previously cited liter- 
ature and the known groups validation, it 
seems that the scales constructed have met the 
minimum requirements of an adequate atti- 
tude scale. An underlying hypothesis of the 
present research was that the various samples 
did not come from a homogeneous population 
with respect to attitudes toward the objects in 
question. The results indicate that the vari- 
ous subgroups respond differentially to each 


TABLE 3 
INTERCORRELATION MATRIX -SCALE SCORES 
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Attitude scale 


Attitude scale* 
ee LEE 


Likert 1 | Likert 2 | Thurstone 1 


Likert 2 |. 486 jd 
Thurstone 1 .69 62 | P 
‘Thurstone 2 64 .64 | ES 


a Mlobtained coefficients significant at the M level, 
scale in the direction and magnitude originally
expected, with only minor discrepancies. 

The personnel manager group was expected to take a more favorable position
toward the attitude objects because of certain experiential factors inherent
in their individual roles and in group membership. Likewise, the union
officer groups were expected to hold unfavorable attitudes due to lack of
familiarity with psychology and science in general, distrust of
management-imposed methods, and past involvement in grievance cases
concerning test results versus seniority. The activist groups were expected
to be more moderate in their views than the union officers, but on the same
end of the continuum, while the students were expected to react similarly to
the personnel managers.

These results seem to indicate that the 
scales have been demonstrated as valid, at 
least for use in research with extreme groups 
in the desired subject population. It is impor- 
tant, however, to keep in mind the fact that 
certain groups from the population specified 
(industrial work force) were not included in 
testing the scales (e.g., rank and file workers
not active in the union, unorganized workers,
lower level white-collar workers, etc.). For
some or all of these groups, the attitude scales
might possess different properties than those
demonstrated in the present study.

The question of the relative superiority of
the two sets of scales is difficult to answer. If
we look solely at reliability, the Likert scales
seem clearly superior. On the other hand, if
we make the judgment on the basis of known
group validity,
slight edge. From the results presented in 
Table 2, it appe 
are capable of 
groups. If we 
(Table 3), the Lik 
ogists; that is, they are really the same
attitude object.
Another reasonable explanation would be
that the intercorrelation can be accounted for 
by method variance. It may be, as Seiler and 
Hough (1970) suggest, that item pools should 
be constructed independently for the Likert 
and Thurstone scaling procedures due to the 
alleged theoretical monotonic-nonmonotonic 
nature of the two methods, as opposed to the 
common item pool method used in the pres- 
ent study. It may also be, as suggested by 
Scott (1969), that maximally dissimilar in- 
struments are necessary if their correspond- 
ence is to be attributed primarily to common 
content. 


As has been previously noted in the literature,
the Thurstone scales are more distasteful
to complete, due to their either/or response
format. Many of the personnel managers 
(Subgroup 4d) completed only the Likert 
scales, in spite of the fact that in half of the 
questionnaires the Thurstone scales came first. 
It is possible that this was a contributory 
factor in the low-reliability estimate in the 
Thurstone psychology scale. We are at a loss 
to explain why the same phenomenon did not 
occur with the Thurstone testing scale, other 
than an interaction with scale content. 

For these reasons, we are inclined to favor 
the Likert scales, although the true test would 
involve identifying separate item pools prior 
to scaling, as mentioned above.

Theoretical considerations aside, the major 
purpose of this study was seen as being ful- 
filled with the development of several attitude 
scales suitable for use with industrial work
groups. They should prove useful for group
research in industrial applications and could
serve to produce both antecedent and consequent
variables for evaluation.

The performance of patrolmen in three matched public housing projects was
compared on a number of different criteria. Patrolmen in one project received
special affective-experiential training, patrolmen in the second project received
special cognitive training, and patrolmen in the third project received
no special training. The general level of performance by affective-experientially
trained officers was significantly superior to that of the officers who
functioned in the other two projects. Implications of these findings for 
police training and for action research are discussed. 


In recent years an increasing number of 
police departments have modified and ex- 
panded both their recruiting and in-service 
training programs. Many of these efforts have 
been directed at the improvement of the police 
officer’s capability for managing social and 
interpersonal disorder.

Unfortunately, most police departments do 
not have the in-house capability for ade- 
quately evaluating their own training inno- 
vations. As such, many training programs 
have been instituted, if not actually institu- 
tionalized, with minimal evaluation. Even 
when programs are evaluated, it is often solely 
on the basis of participants’ responses to questionnaires.
Such procedures, as well as all
paper-and-pencil measures, have several 
shortcomings in evaluating police training 
(Zacker, 1972) and should be considered as 
supplements to procedures that more directly 
assess the specific behavior the training seeks
to modify.

The program assessed in this study was
derived from an earlier project that involved
the training of police in family crisis intervention
(Bard, 1970b). Although that project
resulted in a successful demonstration of
the feasibility of a Family Crisis Interven- 
tion Unit, several important questions were 
raised that the present study was designed to 
answer: Did the training methods used have 
differential effects? To what extent would 
training affect day-to-day police operations? 
And, importantly, would the innovative train- 
ing elements be applicable with newly selected 
police recruits? 

The two training conditions compared 
represent different teaching models: (a) affective-experiential
training stresses active involvement,
learning while doing, and monitored
practice in the field and (b) cognitive
training stresses a lecture format in which in- 
formation is imparted to a group of students 
in a more passive-receptive mode of response. 
If the task requirements of a particular job 
call for active decision making in complex sit- 
uations (as in police work), will training 
that enhances active participation have differential
effects upon job performance than
training experiences that promote a passive 
mode of participation? 


METHOD 
Subjects 


Subjects were 54 probationary patrolmen (an 
entire class of recruits) at the New York City 
Housing Authority Police Department Police Acad- 
emy.3 In addition, 6 experienced officers received 
training with one group. 

3 The training facility for this 1,500-man department.
Members of this department have the same
police powers as those of the larger municipal police

During the first day these recruits were at the
Police Academy, they were told that they were to 
take part in an experiment to evaluate two different 
methods of training, each intended to increase the 
policeman’s effectiveness and safety through increased 
knowledge about human behavior. To control for 
motivation and skill level, subjects were then ran- 
domly divided into two groups: affective-experiential 
training and cognitive training. Thus, 30 patrolmen 
were randomly assigned to the affective-experiential 
training group (24 recruits plus the 6 senior patrol- 
men) and 30 to the cognitive training group. 


Training 

Police Academy training. Each subject underwent 
the normal 12-week program at the Academy. To 
accommodate the additional training this recruit 
class received, their Academy training was extended 
1 additional week. Academy training is provided 
primarily by police officers who are staff members 
of the Academy. 

Affective-experiential group training. This group 
met at the Psychological Center of the City College, 
City University of New York, on 12 successive 
Tuesday mornings, during which they received a 
total of 42 hours of training concurrent with their 
Academy training.* Subjects spent much of this time 
in one of two small groups, the leaders of which 
were graduate students in the [illegible]
psychology program. Occasionally serving [illegible]
of the Thirtieth Precinct
(New York Police Department) Family Crisis Intervention
Unit (Bard, 1970b), each of whom was an
experienced patrolman [illegible] with small-group
discussions and real-life simulations.

Training procedures for the affective-experiential
training group included group [illegible],
simulations of interpersonal conflicts, role [illegible],
lectures which were all designed to improve
subjects’ ability to manage interpersonal conflicts by
[illegible] experiences that promoted active
[illegible]
conflicts which had or might require police intervention;
most of these situations either had occurred or
might occur outside the group and were discussed 
primarily in terms of the police officer’s role. Also, 
the aura of expertise usually placed by participants 
upon (and accepted by) leaders of other forms of 
groups was discouraged by the two civilian group 
leaders, who divested themselves of any impression 
that they had competence in technical police matters 
or in the law—the only competence they projected 
pertained to human behavior. 

Cognitive group training. The cognitive group met 
at the Psychological Center on 12 successive Wednes- 
day mornings, during which time they also received 
a total of 42 hours of training concurrent with their 
Academy training. In order to provide for cognitive 
group training that was similar in form to usual 
Academy classroom training, the lecture format con- 
stituted the primary teaching method. The curricu- 
lum covered aspects of psychology, sociology, and 
physical and social anthropology which were all 
designed to provide a well-rounded view of human 
motivation and behavior. Fourteen instructors 
geared their presentations to cover major police- 
relevant issues and trends in these areas. 


Assignment to Housing Projects 


At the conclusion of the recruit training period, 
14 of the 24 affective-experiential-training recruits 
were randomly selected to staff two of the experi- 
mental housing projects: 8 were assigned to one of 
the projects and 6 to the other. In addition, each 
project was assigned 3 of the affective-experientially 
trained senior officers, who had previously been 
assigned to these projects. The remaining patrolmen 
normally assigned to these two projects were re- 
assigned elsewhere so that coverage would then be 
provided only by the men who had experienced 
affective-experiential training in conflict manage- 
ment. Only one of these projects was used for pur- 
poses of comparison, since the other was deliberately 
selected because it differed from the remaining housing
projects selected for study.

A random sample of five cognitive-training re- 
cruits was assigned to a third housing project. All 
except four of that project's normally assigned com- 
plement were reassigned elsewhere for the duration 
of the experiment. 


A fourth housing project, carefully selected so that
its internal environment, external environment, and
level of police activity were similar to that of the
affective-experiential- and cognitive-training proj-
ects, served as a further control. Its [illegible]
complement of 11 senior officers was left intact.

5 The interested reader is referred to [illegible]
and Rutter (1972) for data on comparisons between
the two original affective-experiential training proj-
ects. Copies may be obtained by writing to [illegible]
Bard, Graduate School and University Center, City
University of New York, 33 West 42nd Street, New
York, New York 10036.


Affective-Experiential Consultation Period


Once weekly, for a 14-week period that began 
shortly after the start of the study year, affective- 
experiential-training group officers reported to the 
Psychological Center. There they participated for 1 
hour of individual consultations about conflicts they 
had managed during the prior week and for 2 hours 
of small group discussions. The officers usually met 
with a different consultant each week. Of the con- 
sultants, 9 were female and 5 were male. Eleven 
were doctoral students in clinical psychology at the 
City College, and 3 were Fellows in Community Psychiatry
at the Columbia University College of Physicians
and Surgeons.6

Both officer and consultant were regarded as 
experts in their respective fields. By pooling their 
expertise during the consultation hour, it was 
expected that each would obtain a greater under- 
standing of the conflict intervention discussed, the 
officer's effectiveness in managing it, and alternatives 
for managing similar situations. These consultations 
were not limited solely to actual police cases, but 
included other, sometimes personal, issues. 

It should be emphasized that these ongoing weekly 
consultation sessions were considered a critical facet 
of the affective-experiential training, incorporating
an essential element of several aspects of professional
training, that is, the practicum, the clinical, or the
“learning while doing” (Bard, 1971).

Subsequent to the consultation period, and for the
remainder of the study year, there were few contacts
between the affective-experiential-training group
officers and the staff.


Police Performance Criteria 


There are many indexes available to police organizations
that are derived from day-to-day police
operations. It is assumed that police administrators
are among the best to judge “good” police work.
Before the results of the study were known, a list of
police performance criteria for which data could be
reliably obtained was presented [illegible] mander of the
[illegible] Police Department [illegible] administrator’s
sensitivity [illegible] performance and to [illegible]
his department [illegible] clearance rates for total
crime, for felonies, for misdemeanors, and for
offenses; number of misdemeanors; number of misdemeanor
arrests; number of offense arrests; danger-tension
index;9 total crime; and total number of
arrests.

6 The consultants were Elsie Chandler, Margaret Dolid,
Jeff Eagle, [illegible]berg, Roger Graham, Bonnie K[illegible]

7 [illegible] Beckel, Chief of Patrol, New York City
Housing Authority Police Department.

Data for each criterion were collected for the 
following periods: (a) the 1-year period beginning 
with the assignment of affective-experiential- and 
cognitive-training group recruits to their respective 
housing projects (from January 9, 1970 to February 
8, 1971), subsequently referred to as year 1970, and 
(b) each of the two 12-month periods immediately 
preceding the study year, subsequently referred to as 
years 1968 and 1969.
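
The danger-tension index listed among the criteria reduces to a one-line computation. The sketch below follows the footnoted definition (total arrests divided by total sick days, times 100); the function name and the counts are hypothetical and are not data from the study.

```python
def danger_tension_index(total_arrests: int, total_sick_days: int) -> float:
    """Total arrests per sick day, times 100, per the footnoted definition."""
    if total_sick_days <= 0:
        raise ValueError("total_sick_days must be positive")
    return total_arrests / total_sick_days * 100


# Hypothetical counts for one project-year (not taken from the study).
index_1970 = danger_tension_index(total_arrests=45, total_sick_days=120)
print(f"{index_1970:.1f}")
```

The guard on `total_sick_days` is a design choice of this sketch: a project-year with no recorded sick days leaves the index undefined rather than infinite.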


RESULTS 
Chi-square tests with Yates’ correction for
continuity were used to make all within- and
between-project comparisons. The relationship
between year 1970 and the average of the two
prior years (years 1968 and 1969) was considered
to provide a somewhat more stable
index of change than that between year 1970
and year 1969 alone.
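
As an illustration of the test named above, the following sketch computes a chi-square statistic with Yates' correction for a 2 x 2 table. How each comparison was arranged into a table is not stated in the text, so the layout (incidents cleared versus not cleared in two periods) and all counts are hypothetical.

```python
def yates_chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square with Yates' continuity correction for the 2 x 2 table
    [[a, b], [c, d]] (1 degree of freedom)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be nonzero")
    # Shrink |ad - bc| by n/2, but never below zero.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * diff ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts: rows = period (baseline, study year),
# columns = (incidents cleared, incidents not cleared).
chi2 = yates_chi_square(30, 70, 50, 50)
CRITICAL_05_DF1 = 3.841  # chi-square critical value, alpha = .05, df = 1
print(round(chi2, 2), chi2 > CRITICAL_05_DF1)
```

The correction subtracts half the sample size from the absolute cross-product difference before squaring, which makes the test slightly more conservative than the uncorrected statistic for small counts.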


Within-Project Comparisons 


Analyses of changes are presented in Table 
1. The affective-experientially trained officers 
obtained significantly improved performance 
during year 1970, relative to the average of 
the two preceding years, on 6 of the 10 
criteria. For 5 of these 6, the 1970 rate in the 
affective-experiential-training group was also 
superior to its 1969 rate (not shown). In no 
case did the performance of the affective- 
experiential-training group decline in year 
1970 over the average of the two preceding 
years. 

The cognitively trained officers in the 
cognitive-training group obtained improved 
performance on 4 of the 10 variables. On none 
of these, however, was the 1970 rate superior
to its 1969 rate (not shown). Cognitive-training
group members showed decreased performance
on 2 other variables (offense clearance
rate and number of offense arrests), and for
each of these the 1970 rate was also inferior
to the 1969 rate.

8 Clearance rates for crime categories are calculated
as number of incidents reported/number of arrests
for such incidents.

9 In considering ways of assessing morale, [illegible]
amount of time lost because of illness loomed [illegible]
as a measure. It became increasingly clear that [illegible]
time probably reflected factors related to morale,
that is, the need for relief from danger and tension
on the job. Working (often alone) in high-[illegible]
areas could be considered to exact a toll expressed in
tension-related absenteeism. While it is not considered
that all sick days are due to tension-related
illness, it may be that many are. The danger-tension
index was calculated as total arrests/total sick days
times 100.

The senior untrained officers in the control 
group did not improve on any of the criteria 
in year 1970. The control group declined in 
performance on 1 of the 10 variables (danger-
tension index) from the average of the two
prior years to year 1970 and on 4 variables 
(misdemeanor clearance rate, number of mis- 
demeanor arrests, danger-tension index, and 
total arrests) from 1969 to 1970. 

Between-Project Comparisons of Year 1970 Rates

When the three housing projects’ criterion 
rates in years 1968 and 1969 were not signifi- 
cantly different, their year 1970 rates were 
compared. Of the four criteria on which 1970 
rates in the affective-experiential- and the 
cognitive-training groups could be compared, 
the affective-experiential was superior on 
three criteria (p < .05 or less): clearance
rates for total crime, for felonies, and for
offenses. There was no difference between the
affective-experiential and the cognitive groups
on misdemeanor clearance rate.

Of the six criteria for which 1970 rates in
the affective-experiential and the control
groups could be compared, the affective-
experiential group was superior on four
(clearance rates for total crime and for mis-
demeanors, number of misdemeanor arrests,
and total number of arrests), while on two
(clearance rates for felonies and for offenses)
there were no differences.

Of the six criteria for which 1970 rates in
the cognitive-training and control groups
could be compared, the cognitive-training group was
superior on misdemeanor clearance rate, the
control group was superior on number of mis-
demeanors, while on four (clearance rates for
total crime, for felonies, [illegible]
total crime) there were no differences.


TABLE 1
WITHIN-PROJECT COMPARISONS: RANK ORDER OF CHANGE

[Criterion rows, the (1968 + 1969)/2 : 1970 values, and the individual ranks are illegible in this copy.]

                    Affective-
                    experiential   Cognitive   Control
Sum of ranks             30           17          13
Mean of ranks            3.0          1.7         1.3

Note. The highest rank (3) was assigned to the project having the most
favorable change from (1968 + 1969)/2 to 1970, the lowest rank (1) to
the project having the least favorable change.


Overall Comparisons Between Projects


Rankings of the three housing projects’ [illegible] the average of
years 1968 and 1969 on each criterion variable
are also presented in Table 1. For each
criterion, the affective-experiential-training
group received the highest rank, denoting
greatest improvement (or least decrement).
Application of the Friedman test (Friedman,
1937), which assumes neither homogeneity of
variance nor normality of the parent population
(Wike, 1971), revealed the sum of ranks
of the three housing projects to have differed
significantly (χ² = 15.80, p < .001). Application
of the Nemenyi test (Miller, 1966) to
the means of the ranks for each project indi- 
cated that the affective-experiential group 
received significantly higher rankings than 
either the cognitive group (p < .05) or the 
control group (p < .01), whereas the cogni- 
tive group was not significantly different from 
the control group. This analysis, which allows 
comparison among the three housing projects 
across criteria, provides results consistent with 
those obtained from the within- and between- 
project comparisons above: the affective-
experiential group achieved superior performance
during the study year relative to the
control projects.
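
The reported Friedman statistic can be checked from the rank sums alone. The sketch below assumes 10 criteria, 3 projects, and rank sums of 30, 17, and 13 (as best they can be read from Table 1), and applies the standard Friedman formula without a tie correction; the function name is illustrative.

```python
def friedman_chi_square(rank_sums: list[float], n_blocks: int) -> float:
    """Friedman statistic from per-treatment rank sums (no tie correction).

    chi2_r = 12 / (n * k * (k + 1)) * sum(R_j ** 2) - 3 * n * (k + 1)
    where n is the number of blocks (criteria) and k the number of
    treatments (housing projects).
    """
    k = len(rank_sums)
    expected_total = n_blocks * k * (k + 1) / 2
    if abs(sum(rank_sums) - expected_total) > 1e-9:
        raise ValueError("rank sums do not add up to n * k * (k + 1) / 2")
    sum_sq = sum(r ** 2 for r in rank_sums)
    return 12 / (n_blocks * k * (k + 1)) * sum_sq - 3 * n_blocks * (k + 1)


# Rank sums as read from Table 1 (an assumption; the table is partly
# illegible): affective-experiential, cognitive, control.
rank_sums = [30, 17, 13]
chi2_r = friedman_chi_square(rank_sums, n_blocks=10)
print(round(chi2_r, 2))

# With k = 3 the statistic is referred to chi-square with 2 df;
# the .001 critical value is 13.816.
print(chi2_r > 13.816)
```

Under these assumed rank sums the formula gives 0.1 x 1358 - 120 = 15.8, which agrees with the value of 15.80 stated in the text and exceeds the .001 critical value for 2 degrees of freedom.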


DISCUSSION


The results indicate that the day-to-day 
performance of police officers trained with 
affective-experiential methods was generally 
more effective than (a) the level of perfor-
mance of officers who previously worked in the
same project; (b) the level of performance of
officers working in a similar housing project 
(five ninths of whom were newly assigned offi- 
cers who had received cognitive training); 
and (c) the performance of experienced, un- 
trained officers in a similar housing project. 

Two considerations make the findings more 
striking. First, the affective-experiential-train- 
ing group leaders were instructed not to 
touch upon police matters that related to the 
specific performance criteria subsequently
employed in the study. Since mental health 
professionals might have ideas about appro- 
priate police behavior at variance with the 
instructions the recruits were receiving at the 
Academy, challenging these instructions might 
have resulted in confusion and danger for the 
recruits in the performance of their duty
(Zacker & Bard, 1973).

The second consideration is that the training
program for the cognitive-training recruits
was intended to be of a qualitatively high
level. It was, in fact, well received, and clearly
not regarded as a dull, unrewarding experience
for most of the cognitively trained
recruits. For example, it was common for the
recruits to engage in class discussions (sometimes
quite heatedly) and to seek out instructors
during breaks. Indeed, when their evaluation
of the training was obtained by
anonymous questionnaires administered at 
the conclusion of training, 68% of the cogni- 
tive-training recruits gave unconditionally 
affirmative replies when asked if future groups 
of police recruits should receive the same kind 
of training. Further, 58% of the cognitive- 
training recruits felt that even experienced 
patrolmen (whom recruits typically seek to 
emulate) should receive this kind of training. 
The corresponding figures for the affective- 
experiential-training recruits on these issues 
were 91% and 86% (Zacker, 1972). Prior to 
the start of training, the staff decided to make 
the cognitive training as relevant and interest- 
ing as the lecture format would permit. Had 
cognitive training instead been as dull as 
Police Academy lectures typically seem to be, 
the year 1970 performance in the cognitive- 
training group might well have been poorer 
than actually occurred.

The data suggest that the affective-experiential
training promoted a sense of competence
and motivation which generalized to
total job performance. Further study is 
needed to determine whether other officers 
in other organizations receiving equivalent 
training would obtain similar results. 

Engaging in action research usually
requires that the investigator make some
compromise between experimental precision,
on the one hand, and the realistic constraints
imposed by the setting, on the other. In the
present study, for example, it was intended
that the four senior cognitive-training officers
also receive training. Because of manpower
needs, however, the Police Department was
unable to release these personnel for training
with the cognitive-training group recruits. As
a result, not all cognitive-training personnel
had received training prior to the study year.
This partial failure of control, although
unavoidable, may have accounted for some of
the differences in performance data.

It is also apparent that the consultation
period for affective-experiential officers
resulted in their having more extended training
than did the cognitive-training officers.
This inequity was not intended to “stack the
deck” in favor of the affective-experiential
training group but, rather, was designed with
two important considerations in mind. The 
first concerned the typical organizational 
response by police departments to an inno- 
vation; that is, to regard it merely in terms 
of adding several hours of classes on the 
subject during recruit training. Such training 
by department members is conducted usually 
in a lecture format rigidly controlled by an 
outline. It was intended that cognitive training
follow such traditional police training
(i.e., lectures during Academy training with- 
out extended, follow-up in-service training). 
Had there been as appreciable an improve- 
ment in performance in the cognitive-training 
group as occurred in the affective-experiential-training
group, police departments could
then have easily integrated the new 
“material” without departing from their con- 
ventional teaching methods. Hence, determin- 
ing whether this was possible was a key 
purpose of the study. 

A second consideration was the expectation
that had the cognitive-training officers
returned for weekly lectures after assignment
to projects, there might have been unfortunate
consequences. Because of its lack of
relevance to the immediate concerns of working
police officers in the field, the use of ongoing
in-service cognitive-training lectures was
believed to offer the prospect that performance
would be adversely affected in
cognitive training. While this might have
increased the differences in performance
between the affective-experiential- and cognitive-training
groups, adverse effects on cognitive-training
group performance might have
seriously affected the communities in which
the disaffected officers were functioning. This
issue highlights one of the dilemmas of [illegible]
research: the necessity of preserving a [illegible]
commitment when it is counter to [illegible]
empirical integrity of the research.

To have provided cognitive-training [illegible]
extra training during the recruit [illegible]
that their total specialized training [illegible]
that received by the affective-experiential
officers might at first glance seem to [illegible]
the issue. However, while such a procedure
would have offered some control [illegible]
length of training, it would have necessitated
the creation of two separate Academy training
programs, one for affective-experiential
training and one for cognitive training,
thereby reducing the level of similarity of
Academy experiences from the level achieved
in the current study.

An important consideration should be
emphasized regarding the two training
methods. Police training is characteristically 
like the cognitive-training approach, which 
can be seen as a “military-vocational model.”
This approach, geared to the training of 
technicians through the transmission of infor- 
mation to large groups in a lecture format, 
minimizes individual decision making by the 
practitioner. It may be that the passivity 
fostered in the recipient of such training is 
countervalent to the activity required in 
effective police work. Hence, the policeman-
practitioner may be seriously compromised
in both his ability to process information 
received while in a passive mode and in 
translating it to the active mode required by 
his daily functioning. 

The affective-experiential approach, which 
was more effective in the present study, can 
be seen as a “professional model," in which 
emphasis is on the exercise of individual 
judgment in the delivery of complex human 
services. In this model, the active learning
mode is consistent with the mode required in
subsequent on-the-job functioning.


[illegible] program designed to enhance the job-interviewing
skills of culturally disadvantaged persons. Clients from three
manpower agencies were randomly assigned to one of three treatment
conditions: (a) a [illegible] program of videotape-feedback and
behavior-modification (n = 24), (b) a videotape-feedback-only program
(n = 21), and (c) a no-treatment (control) program (n = 19). The results
[illegible] 10 measures of observable interview behavior, [illegible]
as judges in randomly presented videotaped [illegible]
greater positive change from the initial to the final interview
among the combined-treatment group. Moreover, on two of these
[illegible] analysis of variance showed that the change
was statistically significant. [illegible]ation item concerning
the probability of [illegible] statistical significance. The results
suggest [illegible] substantially increase the [illegible]able employment.


The employment interview continues to be
a central procedure in personnel selection in
spite of repeated discussions of its limitations.
It is therefore important that the applicant
know what is expected in an interview and be
able to present himself, his skills, and his
experience in the most effective way possible.

The culturally disadvantaged person often
performs poorly in the employment interview.
He is characterized by a passive presentation
of self; specifically, he lacks facility in
language, assertiveness, and spontaneity in his
interactions with the socially distant interviewer.
In addition, the disadvantaged applicant
often feels it is inappropriate to talk
about himself, and he fails to understand the
rationale behind the interviewer’s questions.
The employment interviewer frequently perceives
him as unskilled, [illegible]
suited to both the technical and interpersonal
demands of complex vocational roles.

This research was conducted by the Colorado [illegible]
University Manpower [illegible] under a contract
from the U.S. Department [illegible].

Requests for reprints should be sent [illegible]
Barbee, 12550 West Virginia [illegible],
Colorado 80228.

The applicant needs to know what inter[illegible]
needs to know why yes or
no answers are usually inadequate and how to 
convey information about skills, experience, 
competence, and job interest. In short, he 
needs to know how to sell himself legitimately 
and effectively. 

Three training modes were identified: 
videotape-feedback, behavior-modification, 
and microcounseling techniques. The first, 
videotape feedback, has been traditionally 
employed in psychotherapy, and the medium 
lends itself equally well for studying the inter- 
view behavior of disadvantaged job appli- 
cants. Moreover, it is practical in the actual 
conduct of a training program. Previous 
research in psychiatry, typified by the work 
of Wilmer (1967a, 1967b), suggests that the
advantages of videotape include allowing 
people to develop skills of careful observation 
and to appreciate the meaning and appropri- 
ateness of all the dynamics of human inter- 
action. Stoller (1965, 1966) introduced the 
idea of focusing patients’ attention on be-
havior patterns. Similarly, Geertsma and
Revich (1965) directed patients’ attention to 
specific cues and behaviors in therapy sessions 
and reported that patients repeatedly shown 
their videotaped interviews exhibited improved
direction, structure, and focusing.

Implicit in the work of these researchers is
the assumption that by utilizing videotape, it 
is possible to identify and strengthen positive 
(facilitative) behaviors and change the non- 
facilitative behaviors. Once specific behaviors 
which may need altering are identified, be- 
havior modification—the second major train- 
ing mode—is one approach for bringing about 
desired behavioral change.

The use of behavior modification in psycho- 
therapy from 1940 to the early 1950s 
appeared to focus on global aspects such as 
personality change. Herzberg (1947) and
Rado (1951) came to regard therapeutic sessions
as mere practice for improving behavioral
responses to others. In the early
1950s, investigators began to feel that, impor- 
tant as general therapist-patient relationships 
are, they are not as amenable to immediate 
investigation and application as are individual 
behavioral units. The applications of behavior 
modification now range from retraining per- 
sons with character disorders to counseling 
and social work (Kanfer & Phillips, 1970) to 


Bandura’s (1969) important work on model- 
ing and imitation. 


Much the same observation can be made
of the third mode of experimentation. Microcounseling
(Ivey, 1971) is similar in concept
to focusing and specificity in the range of
stimuli which may be attended to.


METHOD
Subjects 


in this study were 64 enrollees from
t agencies serving the disadvantaged
in the Denver metropolitan area. All subjects were

training programs such as clerical, 
nursing (LPN), welding, and many 
ect expected to be actively seeking 
in two months, upon completion 


altered, rehearsing roles, 
approximate behaviors), and 
interview (n = 24); (b) a video program in which
Subjects (n=21) viewed a vid 5 2 watch 


interviews and then completed
(c) a no-treatment program in which subjects
(n = 19) completed initial and final interviews with
no relevant intervening experience. 

All subjects went through the following sequence: 

1. Each subject filled out a general application 
form for a job for which his training and/or 
experience would qualify him. The subject also wrote 
a “help wanted” ad as it might appear in a newspaper
in as much detail as he thought necessary or
appropriate. 

2. The ad and application form were given to a 
research personnel interviewer, a person who previously
had been a personnel interviewer for a major
company. The interviewer was asked to limit the 
interview to about 10 minutes and to ask for slightly 
higher skill requirements than were specified in the 
want ad so that the subject would have to sell him- 
self. 

3. The subject entered the room and was greeted 
by the interviewer whom he had never seen before, 
and the interview was conducted and videotaped. 

4. The procedure for each of the three experimental 
groups varied as follows: (a) Control subjects were 
asked to wait while preparations were made for their
final interviews. No treatment was provided. The 
waiting period was approximately 30 minutes. (b) 
Subjects in the video group were shown their initial 
taped interviews without comment and then asked 
to wait for their final interviews. Again, the waiting 
period was approximately 30 minutes. (c) Subjects 
in the combined videotape-feedback and behavior-modification
programs were instructed to approach
the viewing of their just completed interviews with 
the following set: 


We want you to watch your interview very care- 
fully and look for specific things you said or did 
that you think should be changed in the next 
interview. For example, you might want to learn 
to talk louder or present a facet of your training 
in a different way. Write down a list of specific 
things you want to learn to do. 


The initial interview was then shown to each com- 
bined subject. The trainer helped him develop his 
list, adding items he had ignored if he agreed. A 
behavior change schedule was then developed for 
each subject consisting of at least four but no more 
than six specific items that the subject wanted to 
change. A sample behavior change schedule item 
might be, “Clarify the exact kind of work you did 
on your last job.” Each behavior to be changed was 
taken in turn. The trainer suggested alternative
ways of responding, giving his reasons each time
based on his general knowledge of what interviewers
look for in interviews. An effort was made in the
case of each item on the behavior change schedule to
develop three or four alternative responses meeting
the approval of both subject and trainer. By using
natural mannerisms such as being attentive, nodding
of the head, smiling, or commenting, “good,” approximate
behaviors were reinforced. In addition, subjects
were encouraged to inject job-relevant questions
into the training session.

[Table 1, title fragment:] ... AND CORRELATIONS OF MEAN PERFORMANCE CHANGE
[Table 1, note fragment:] B = the videotape-treatment group, and C = the no-treatment


tions dealing with job skills and personal adaptability 
and a final question that asks the judge to rate the 
“probability of hire” of this applicant if an appro- 
priate job existed in his work setting. Each item was 
cast into a seven-response Likert-type format. Differences
between the initial and final interview ratings
by the judges were analyzed using analysis of
variance and t tests for unequal variances.
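
The unequal-variance t test mentioned above can be illustrated with a minimal sketch. It assumes the comparison is made on change scores (final minus initial rating) and uses the Welch form of the statistic with the Welch-Satterthwaite degrees of freedom; the paper does not say which unequal-variance procedure was used, and the ratings below are hypothetical, not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for an unequal-variance (Welch) t test on two samples."""
    na, nb = len(a), len(b)
    # Squared standard error of each sample mean.
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Hypothetical change scores on one seven-point item (final minus initial).
combined_change = [2, 1, 3, 2, 0, 2, 1, 3]
control_change = [0, 1, 0, -1, 1, 0, 0, 1]

t, df = welch_t(combined_change, control_change)
print(f"t = {t:.2f}, df = {df:.1f}")
```

The resulting t would then be referred to a t distribution with the approximated (generally non-integer) degrees of freedom.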


RESULTS 


Prior to viewing the videotapes, judges were 
asked to indicate the importance they 
attached to each of the interview dimensions 
used on the scale in considering applicants in 
general for entry-level employment. According
to their responses, interviewers placed relatively
more weight upon verbal communication
and interpersonal skills than on job
skills. In addition, they placed relatively less
weight on overt expressions of anxiety, indicating
their awareness that the interview is
anxiety provoking.
Comparing the three experimental groups'
initial-final interview change in “specific
interviewing behaviors” (Part 1), the data
show the following: (a) Those subjects who
viewed their initial videotapes but did not
receive the behavior-modification training
program (video subjects) did not improve significantly
more (Column B x C) than did
those subjects who received no intervening
program (control subjects). (b) Those subjects
who received the combined-treatment
program of videotape feedback and behavior
modification (combined subjects) improved
significantly more on two of the items (level
of questions asked about job tasks and conditions,
and assertiveness and initiative) and
Part 1 total compared to video subjects
(Column A x B). (c) Comparing combined
subjects to control subjects (Column A x C),
the former improved significantly more on
three of the items (level of questions asked
about job tasks and conditions, assertiveness
and initiative, and ability to respond to the
interviewer’s questions) and Part 1 total.
There were no significant differences among
the three groups on the two items, job skills
and personal adaptability. On the probability
of hire item, which is by far the most important,
the combined subjects improved significantly
more than did the control subjects.
Although many of the individual items failed
to reach statistical significance, it should be
noted that with the exception of one item
(honesty and openness), there was greater
positive initial-final change for the combined
group than for the video or control groups.

In addition to comparing the initial-final 
performance change across the three experi- 
mental groups, a number of subject character- 
istics were examined to determine if these 
characteristics moderated the amount of 
initial-final performance change within each 
group. Table 1 also presents the correlations 
of initial measures to initial-final performance 
change. Generally, the lower the initial per- 
formance of the combined and video subjects, 
the greater the positive change in interviewing 
performance and, conversely, the higher the 
initial performance, the lesser the change. 
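
The relation described here, lower initial performance going with larger positive change, is a negative correlation between the initial score and the change score. A minimal sketch follows, assuming a Pearson product-moment correlation (the paper does not name the coefficient) and using hypothetical ratings rather than the study's data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical initial and final ratings for five subjects.
initial = [2, 3, 4, 5, 6]
final = [5, 5, 5, 6, 6]
change = [f - i for i, f in zip(initial, final)]

r = pearson_r(initial, change)
print(f"r(initial, change) = {r:.2f}")
```

A negative r of this kind is expected to some degree whenever the rating scale has a ceiling, since high initial scorers have less room to improve.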

The Culture Fair Intelligence Test (Cattell, 
1960) was administered to all subjects prior 
to their participation in the program. Positive 
initial-final change on the Part 1 total which 
contains specific interviewing skills is signifi- 
cantly correlated to intelligence for the video 
group but not for the combined group. On the 
probability of hire item, both the video and
combined groups’ intelligence test scores are
significantly related to positive change. 


DISCUSSION 


The results of this study show that the 
combined program of videotape feedback and 
behavior modification produced statistically 
significant changes, as judged by potential 
employers, in the interviewing ability of dis- 
advantaged job applicants. While videotape 
feedback alone was not sufficient to bring 
about statistically significant change, trends 
within these data are in the positive direction. 
From a practical point of view, videotape 
feedback was not intended to be used as an 
independent training program. Its function, 
rather, was to present an accurate representa- 
tion of the initial job interview and to allow 
the applicant and trainer to focus upon 
specific behaviors that should be improved 
within the interviewing interaction. In addi- 
tion, videotape may well have contributed by 
creating interest, by reducing client defensive- 
ness, by allowing a more objective exami- 
nation of the interview, and by helping to 
generate meaningful discussion. 

It is important to emphasize that the com- 
bined program stimulated and guided the 
disadvantaged job applicant to become a more 
active participant in his interview. Increased 
assertiveness impressed the employment inter- 
viewers in the research evaluation favorably 
and increased their ratings of the applicant’s 
suitability for employment. Disadvantaged 
persons who gained assertiveness and self- 
confidence through learning specific inter- 
viewing skills in the program are potentially 
better prepared for actual interviews when 
they begin to seek employment. Not only are 
they able to present themselves more effec- 
tively to the interviewer, they are better able 
to objectively evaluate their own performance 
in terms of the criteria learned in training. 
They are able to rationally consider areas of 
needed improvement without blaming them- 
selves or the interviewer or having depressing 
feelings of failure. With this more objective
viewpoint, they are willing to pursue more
interviews in spite of early unsuccessful
experiences.


Unfortunately, data were not available to
evaluate the amount of transfer from the
interview training to actual job-seeking behavior.
Subsequent studies will be required to
determine whether interview training promotes
increased job interviewing and acquisition
of appropriate employment.


Four experimental tasks were used to test the effects on procedural performance
of providing special instructions on logical-tree construction and use and
of limiting versus not limiting the time available for studying the task instructions.
Results indicated that performance accuracy was statistically better when
either one or both logical-tree instruction and practice was provided and the
task instruction study time was limited, than when subjects were permitted
to study the task instruction for as long as they chose and in whatever way
they chose. It was concluded that quite simple procedures for familiarizing
subjects with logical-tree operations can improve performance on procedural
tasks. It also appeared that placing a limit on the time available for study
of instructions can be better than permitting unlimited time.


Recent research (Davies, 1967, 1970; 
Jones, 1968) has indicated that complex pro- 
cedure-following (or rule-following) behavior 
is markedly improved by a presentation tech- 
nique involving logical trees. Moreover, 
human factors specialists (e.g., Colwell, 1971; 
Colwell & Risk, 1971) have already capital- 
ized on logical trees in programs directed 
toward the improvement of performance in 
operational work situations (e.g., maintenance 
jobs). 

Tree-structured data have characteristics in 
common with road maps, flow diagrams, and 
organizational charts of the personnel of a
company. As applied to procedural tasks, 
logical Consist of separate 
tion for action and 


y by mapping inter- 
A ns g the various statements of 
“condition” and “action.” Logical trees (alternately
called “decision trees”) express rules
such that only those c


In view of the demonstrated value of logical
trees to procedural performance, it was
thought that performers of procedural tasks
might benefit if they would transform instructions
from the form in which they arrive (if
other than logical trees) into a logical-tree
format. To this end, the present research was
designed to investigate the possibility that
learning and performance on procedural tasks
can be improved by providing instructions to
subjects on the construction and use of
logical-tree representations of procedural
instructions. As an adjunct to this purpose,
another goal of this experiment was to
examine the effects of self-paced versus
externally paced study on procedural learning
and performance.


METHOD 


To obtain base-line data and to prepare the subjects
for the logical-tree instruction, all subjects were
given a pretraining task (Task 1). Following pretraining,
those subjects in the logical-tree group were
instructed to use the logical-tree method in their
approach to subsequent tasks (Tasks 2, 3, and 4)
and were given a brief description on how to do
this. The remaining subjects (the free-study group)
were simply instructed to develop their own approach
to the performance on Tasks 2, 3, and 4. Following
all pretraining and instructions (given in that order),
each subject performed Tasks 2, 3, and 4.

The four tasks (described briefly below) were
presented in the same order for all subjects; counterbalancing
of task presentation order was not used
because a special interest in the effects of the experimental
variables on Task 4 (a synthetic communication
system) dictated placement of Task 4 in the final
position. The presentation order of the other three
tasks was decided arbitrarily.

Half of the subjects from each of the logical-tree
and free-study groups received a fixed amount of 
time (20 minutes) to read the instruction manual 
for each task (the limited time group) and the 
remaining subjects were allowed unlimited manual 
study time (the unlimited time group). All subjects 
were told to rely on their notes and memory of the 
task instructions in answering task problems, since 
the instruction manual would be taken from them 
prior to testing. 

The resulting experiment was a 2 × 2 factorial
design, with two instructional techniques (logical
tree vs. free study) and two manual time treatments
(limited time vs. unlimited time). Each of the four
resulting treatment combinations was administered
to one of four different five-member groups. The
influence of the experimental treatments was tested
by measuring the subjects' performance on each of a
sequence of three different tasks (in addition to the
pretraining task).


Subjects 

The subjects were 20 male students recruited from 
Florida Technological University and paid $1.75 per 
hour for their services. Subjects were randomly
assigned to experimental groups.
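
Random assignment of the 20 subjects to the four cells of the 2 × 2 design can be sketched as follows. The shuffle-and-slice procedure, the function name, and the subject numbering are assumptions made for illustration; the paper states only that assignment was random and that each cell held five subjects.

```python
import random
from itertools import product


def assign_groups(subject_ids, seed=None):
    """Randomly split subjects into the four equal cells of a 2 x 2 design."""
    rng = random.Random(seed)
    ids = list(subject_ids)
    rng.shuffle(ids)
    # The two factors named in the Method section.
    cells = list(product(("logical tree", "free study"),
                         ("limited time", "unlimited time")))
    per_cell = len(ids) // len(cells)
    return {cell: ids[i * per_cell:(i + 1) * per_cell]
            for i, cell in enumerate(cells)}


# Hypothetical subject numbers 1-20.
groups = assign_groups(range(1, 21), seed=0)
for cell, members in groups.items():
    print(cell, sorted(members))
```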


Materials 


Four sets of typed instruction manuals and
corresponding test sheets for Tasks 1-4 were presented
to each subject. Tasks 1-3 were constructed
especially for use in this experiment. Task 4 was an
abbreviated version of a synthetic communication
system, with close similarity to actual, real-world
systems, which was developed as a medium for the
conduct of research in earlier efforts under the
present program (see Bernstein & Gonzalez, [...]).
The four tasks were similar to one another in that
each required the subjects to select a course of action
given certain conditions. “Correct” condition-action
contingencies were defined by rules contained in the
instruction manuals. The four tasks differed from
each other mainly in the number and nature of
conditions and courses of action provided by
each task, as indicated in the following:

The courses of action for Tasks 1, 2, 3, and ik 
20 names (e.g. “Mary”), 6 instructions Der RT 
to building 2173," etc.), 16 letters of the BY fm 
and 5 modes of communication, ad "i n 
respective conditions for Tasks 1, 2, 3, ant (e.g. 
combinations of 5 colors and 4 cioe pe 
greater- than 80") of printed numbers an cn 
acteristics of names (e. “beeing ven ea): 16 
6 personal characteristics (e.£ left an ositions”; 
Þlaying cards, 4 “priorities,” an 4 slog ue system 
and 4 characteristics of # Soe ae pscribers 
(language and frequency capabilities o ET peer 
to the system, availability of channel, and
priority). The respective numbers of problems
for Tasks 1-4 were 16, 10, 80, and 30.


For a problem in Task 4, for example, the partic- 
ular communication mode proper to use might 
depend on whether the receiver's channel was open 
and whether the sender and receiver of the com- 
munication had common language and frequency 
capabilities. 

Logical-tree instruction. Subjects in the logical-tree
group were presented, in printed form, the following 
information: 


To facilitate your speed and accuracy on the 
tests you will be given, you are asked to construct 
“logical trees.” 

To construct a tree, first read through your 
manual and try to get an idea of how various 
combinations of conditions lead to different 
actions. For example, you may want to know the 
conditions under which a person may or may not 
be drafted. Some of the conditions relating to this 
problem are whether the person is at least 18 but 
less than 28 years old, whether he is married, 
whether he is in school, whether he is medically 
able, etc. Corresponding actions are that the per- 
son is drafted, is medically deferred, etc. State the 
conditions as questions which can be answered 
“yes” or “no.” Next, draw “yes” arrows and “no” 
arrows leading from each question to the next two 
questions and/or actions. So, for the example given 
above you start by asking: Is the person between 
the ages 18-28? A “no” arrow is drawn from this 
question to an action item which, in this case, 
would be “exempted from draft.” A “yes” arrow 
is drawn from this question to another question 
such as, Is the person in school? etc. It is often 
convenient to trace “no” answers to the right and 
“yes” answers down. Following is an abstracted 
example of a logical tree. It may be modified to fit 
any given problem. [An abstract drawing of a 
logical tree was shown to the subjects. ] 
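
The draft example in the printed instruction can be written out as a small yes/no tree. The sketch below takes the first two questions and the “exempted from draft” action from the instruction itself; the third question, its placement, and the student-deferment action are hypothetical fillers for the “etc.” in the original, and the class and function names are illustrative only.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Question:
    """A condition stated as a yes/no question with two outgoing arrows."""
    text: str
    test: Callable[[dict], bool]
    yes: Union["Question", str]  # next question, or an action label
    no: Union["Question", str]


def decide(node, person):
    """Follow yes/no arrows from the root until an action is reached."""
    while isinstance(node, Question):
        node = node.yes if node.test(person) else node.no
    return node


tree = Question(
    "Is the person at least 18 but less than 28 years old?",
    lambda p: 18 <= p["age"] < 28,
    yes=Question(
        "Is the person in school?",
        lambda p: p["in_school"],
        yes="deferred as student (hypothetical action)",
        no=Question(  # hypothetical continuation of the example
            "Is the person medically able?",
            lambda p: p["medically_able"],
            yes="drafted",
            no="medically deferred",
        ),
    ),
    no="exempted from draft",
)

print(decide(tree, {"age": 30}))
print(decide(tree, {"age": 20, "in_school": False, "medically_able": True}))
```

Each problem is answered by a single pass from the root, so only the conditions on the path actually taken need to be consulted.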


Along with this printed instruction, the subjects
were given an oral briefing relating essentially the
same information contained in the printed instruction.


Procedure 

1. Logical-tree condition. Subjects were given
Task 1, logical-tree instruction, Task 2, Task 3, and
Task 4. The subjects worked continuously through
each task with a six-minute break between successive
tasks.

2. Free-study condition. Same as the logical-tree
condition, only no logical-tree instructions were
delivered prior to the start of Task 2. Subjects were
encouraged to develop their own approach to performing
the tasks.

3. Unlimited-time condition. Subjects were allowed
as much time as they desired to study the instruction
manual for each of the four tasks.

4. Limited-time condition. Subjects were permitted
20 minutes for study of the instruction manual on
each of the four tasks.

TABLE 1
MEAN SCORES ON TASKS 1, 2, 3, AND 4

[Column headings: Task 1, Task 2, Task 3, and Task 4, each split into limited time and unlimited time; rows: free-study and logical-tree groups.]

Mean percent error scores

Free study | 26 38 36 38 


3 21 2: 
| : 3 20 | 28 
Logical tree | H 28 42 26 | 6 | 3 pe Br ee 
Mean problem performance time scores (seconds)
" EE 5 E id j E 35 528 661 
Free study | 216 359 235 214 487 385 | s 2 
Logical tree 243 220 328 223 453 368 766 


Mean manual study time scores (seconds)

[Cell values as extracted: limited time, 1200 in all eight cells; unlimited time (free study, logical tree): 674, 746; 809, 1037; 624, 1226; 939, 1299.]


RESULTS AND DISCUSSION


The effectiveness of logical-tree instruction
was confirmed by examining subjects’ notes
on Tasks 2, 3, and 4. A subject was considered
to have used the logical-tree method if his
notations contained statements of conditions
and actions from the task instructions and if
such statements were interconnected with
arrows labeled “yes” and “no.” All 10 subjects
who received logical-tree instruction used
the logical-tree format for their notations,
whereas the notes of only 1 of the 10 free-study
subjects resembled the logical-tree
approach. The other free-study subjects
simply transcribed major sections of the task
instructions, retaining most of the original
organizational structure and making only
notation variations. More particularly, the
following characteristics were frequently
found in the notes of free-study subjects and
were seldom or never found in the notations
made by logical-tree subjects: (a) information
was recorded which was not essential to the
solution of task problems; (b) information
was not recorded which was essential to solving
task problems; (c) connectives (e.g., or,
unless) were employed to relate task conditions;
and (d) the same task condition was
repeated where it was associated with more
than one action.

Mean scores for percent error, problem
performance time, and manual study time on
Tasks 1-4 are shown in Table 1. Percent error
was computed separately for each experimental
group by dividing the total number of
incorrect trials on a task by the total number
of incorrect trials possible on that task. (A
trial consisted of one in a series of problems
which were solvable from the information
contained in the associated instruction manual.)
Manual study time is the total duration
of study on the task instruction manuals by
all subjects in a group divided by the number
of subjects in the group. Problem performance
time is the mean time required by
subjects to perform all problems on a task.

All scores were analyzed separately for each
of the four tasks. In addition, percent error
scores and problem solution time scores on
Task 3 were combined with corresponding
scores on Task 4 and analyzed together. The
rationale for combining these scores for
analysis was: (a) Tasks 3 and 4 showed the
same kinds of effects (i.e., no interactions
were obtained between either error or problem
solution time scores on Tasks 3 and 4 and
any other variable). (b) It appeared that the
most sensitive measure of experimental effects
should be employed. (The rudimentary form
of the logical-tree instruction and the brevity
of practice on the logical-tree method seemed
to represent far from optimal conditions for
developing good logical-tree processing
techniques.)
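
The percent error and manual study time scores defined earlier in this section reduce to simple group-level arithmetic. The sketch below assumes that “incorrect trials possible” means the number of problems on the task times the number of subjects in the group; the counts and times are hypothetical, not values from Table 1.

```python
def percent_error(incorrect_trials, subjects, problems_per_task):
    """Group percent error: incorrect trials over trials possible, times 100."""
    trials_possible = subjects * problems_per_task
    return 100 * incorrect_trials / trials_possible


def mean_study_time(study_seconds):
    """Total manual study time in a group divided by the number of subjects."""
    return sum(study_seconds) / len(study_seconds)


# Hypothetical five-member group on a 30-problem task.
error_score = percent_error(incorrect_trials=42, subjects=5, problems_per_task=30)
limited_time_score = mean_study_time([1200] * 5)

print(f"percent error = {error_score:.0f}")
print(f"mean manual study time = {limited_time_score:.0f} seconds")
```

Under a 20-minute limit every subject who uses the full period contributes 1200 seconds, which is why a limited-time group mean cannot exceed 1200.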

Analyses of variance of error and problem
performance time scores and t tests of manual
study time scores on Task 1 showed no significant
differences between either logical-tree
and free-study subjects or limited-time and
unlimited-time conditions. This suggests that
these groups had equal ability at the start of
the experiment.

Logical-tree subjects took significantly
longer manual study time than free-study subjects
on Task 2 (t = 2.49, df = 8, p < .05)
and Task 3 (t = 2.05, df = 8, p < .05) and
also required significantly greater problem
performance time than free-study subjects on
Tasks 3 and 4 (F = 5.88, df = 1/16, p <
.05; on the combined analysis of Tasks 3 and
4). However, logical-tree subjects had lower
error scores on Tasks 3 and 4 than free-study
subjects (F = 6.71, df = 1/16, p < .05; F =
4.49, df = 1/16, p < .05; on the separate
analysis of Task 4 and the combined analysis
of Tasks 3 and 4, respectively) and their
manual study time improved to approximate
equality with free-study subjects on Task 4.
Thus, logical-tree subjects showed somewhat
of an initial disadvantage and subsequent
improvement to the points of equality
(in manual study time) and superiority (in
accuracy) relative to free-study subjects following
presentation of the logical-tree instruction.
This would appear to reflect an initial
ineptitude but a later increase in skill,
gained through practice, in using the
new logical-tree method. Until some facility
with logical trees developed, use of the logical-tree
method did not help accuracy and
actually harmed manual study time.
The absence of counterbalancing
in this experiment, however, makes it impossible
to determine the extent to which the
observed trends are attributable to practice
with the logical-tree method or to variation
in the nature of the tasks.
Conclusions from earlier research ([...],
1968), however, support the “practice” interpretation.

The slower problem performance time of 
logical-tree subjects on Tasks 3 and 4 of the 
present experiment may be due to the pos- 
sibility that free-study subjects had access to 
less of the required information during testing 
which predisposed them to quickly dispense 
with more test items. Thus, in addition to aid- 
ing the problem-solving process during test- 
ing (as demonstrated by the earlier research) 
the use of the logical trees in the present 
experiment may have assisted in the infor- 
mation-recording process during manual 
study. 

Newman-Keuls tests showed that error
scores on Tasks 3 and 4 combined were significantly
inferior (p < .01) in the free-study–
unlimited-time condition to any of the other 
three condition combinations. The limited- 
time condition did not result in better 
accuracy than the unlimited-time condition on 
Tasks 1 and 2, possibly because motivation to 
perform well was high enough to stimulate 
relatively efficient information processing 
without the additional influence of temporal 
limitation. Presumably, motivation decreased 
with time in the experiment such that only 
those groups receiving the apparent benefits 
of temporal (or logical-tree) “structure” (i.e., 
constraints placed on the learning process) 
were efficient information processors during 
Tasks 3 and 4. 

This finding of a learning decrement resulting
from an unstructured learning situation
may be related to the observation commonly
made by students that their study typically
becomes more efficient as “pressure” from
approaching exams increases. When exams
are “somewhere in the distant future,” the
learning situation is like the unlimited-time–free-study
condition; students may study for
as long as they like and in whatever way they
like. As exam time approaches, the time available
for study becomes more limited and,
thus, their study situation becomes more like
the limited-time condition. When time pressure
becomes great enough, this temporal
source of structure presumably results in more
efficient information-processing
strategies and, as in the present experiment,
learning improves.


THE OPTIMAL USE OF SIMULATION *

The problem of when to stop training on a simulator and the advantages and
disadvantages of simulator fidelity are discussed. A study illustrating the
paradigm that might be used to investigate these variables is described and
suggestions are made for the use of simulation training.


The purpose of this paper is to discuss some 
issues in the optimal use of simulators in 
training and to present some illustrative data 
and a research paradigm relevant to these 
issues. The two issues are “amount of 
simulator practice” and “fidelity of simula- 
tion” required for the transfer of simulator 
training to job performance. 

A major problem in the use of simulation 
training is identifying the optimal time of 
transfer from the simulator to the “real-life” 
situation. As Gagné (1954) has noted, trans- 
fer does not invariably increase with the 
amount of initial training on the simulator. 
Furthermore, there is some evidence that 
there is a point beyond which additional simulator
practice does not produce a concomitant
increase and may produce a decrement in
positive transfer. Examples are a study by 
Caro and Isley (1966) with helicopter flight
[...] operator training by Lefkowitz [...].
Fleishman (1957, 1960) and his associates (Fleishman
& Ellison, 1969; Fleishman & Hempel,
1954, 1955; Fleishman & Rich, 1963) have
conducted a series of experiments on changes
in the pattern of abilities contributing to proficiency
on perceptual motor tasks as practice
continues. Fleishman's studies find
abilities (such as spatial orientation and perceptual
speed) common to other tasks contributing
to performance during learning;
the contribution of these abilities [...]
decrease with practice on the task as [...]
learned. Also, among the factors accounting
for significant variance in task performance
were factors found specific to the task and
uncorrelated with independently measured
[...] factor accounted
[...] increasing proportion of the variance as
practice continued, becoming a primary contributor
to later performance. Fleishman hypothesized
that this factor is, at least in part,
a function of habits learned in relation to the
specific content of the task. In at least one
other study, Fleishman (1960) identified a
second task-specific factor that decreased in
importance with practice on a pursuit task.
He interpreted this as reflecting a “learning
set or general task strategy” which may be
instrumental to early performance and decrease
in importance as proficiency increases.
Recently Hinrichs (1970) has replicated these
findings of increasing and decreasing task-specific
factors using a similar pursuit task.
These studies appear to be relevant to the
problem of transfer optimization and to the
problem of simulator fidelity. To the extent
that the simulator situation contains elements
inappropriate for successful performance in
the operational situation, continued simulator
practice leading to the increased dominance
of “content-specific habits” may well result in
posttransfer habit interference. It would seem
that to optimize the use of simulation, transfer
should occur after the general task strategy
is learned and while general abilities
common to other tasks are playing a major
role, but before possibly interfering simulator-specific
habits become dominant factors
in simulator performance.

* This work [...] by Grant AFOSR-69-

In a previous study, Weitz (1969) used a
verbal learning task that permitted the subject
to learn either by association or by position
and then transferred him to a second
situation in which one of these two variables
(association or position) was negatively
transferred and the other positively trans- 
ferred. He found that performance can be 
enhanced on the second task by allowing 
training only up to the point where the 
appropriate strategy (association or position) 
is learned but when the specific verbal content 
is not yet learned. This finding lends some 
support to the suggestion that overtraining 
on a simulator beyond the level at which task 
context or strategy has been mastered leads 
to a decrement in positive transfer to the 
operational situation. 

Similar results have been found in other 
transfer studies in which simulator fidelity is 
the central concern. There is some evidence 
that simple simulators, with little physical 
similarity to the operational situation, but 
incorporating task strategies similar to the 
real task, can produce significant positive 
transfer. Prophet (1966), for example, reports
a study that compared the simulator effectiveness
of an inexpensive photographic mock-up of an
airplane cockpit with that of an elaborate trainer
costing over $100,000. Subjects received the
identical training in cockpit procedures. The group
receiving mock-up training actually did somewhat,
but not significantly, better than the trainer group
on performance in an aircraft. These and other
findings (see Gagné, 1954, 1962) suggest that a
great deal of beneficial training can be realized
from devices with low physical fidelity. 

Since it appears that, in at least some situations,
a simple simulator of a relatively complex task will
produce as much positive transfer to the real task
as will a more elaborate [...] greater fidelity, one
[...] interaction effect [...] or content-specific
[...]quently interfere with posttransfer performance.
The hypothesis here is that overtraining on a
simulator of greater³ fidelity will lead to the
greater transfer decrement. 

³ It is assumed here that the [...] is less than
perfect fidelity. 


Another consideration must be taken into 
account when designing simulator training. 
Training psychologists (see, for example, 
Gagné, 1962; Prophet, 1966) have frequently 
indicated that the requirements of the opera- 
tional situation should determine the char- 
acteristics of simulator training. As Biel 
(1962) points out, it is the specifica- 
tions of the training objectives, in terms of 
desired levels of proficiency, amount of flex- 
ibility, and other considerations based on any 
analysis of job requirements, that must be 
taken into account in the design of the total 
training program. The simulator is only a 
part of the training system to be designed for 
particular requirements. The evaluation of 
any simulator training should employ these 
requirements as criteria. It is in this context 
that the amount of training and fidelity of 
simulation issues should be viewed. For 
example, some tasks, such as “emergency 
procedures,” require maximum transfer on 
the first “real” trial. In such cases, extensive 
pretransfer training and high physical fidelity 
may be needed. In general, however, it is the 
degree to which the simulator task makes the 
same demands on the operator as the real 
task, rather than mere physical reproduction 
of the task or the content of the components, 
which may determine the degree of positive 
transfer that will accrue from the simulator 
(Prophet, 1966). 

With these issues in mind, we now turn to 
an illustrative study in which some possible 
interactions of amount of training and “sim- 
ulator fidelity” on subsequent transfer to 
“real task” performance were examined. 


METHOD 
Procedure 


A Thomas table-top collator, model T-8, was used
both as the trainer-simulator and the real task.
When the collator was used as a simulator, either
three or five of the eight racks were operating. The
conditions included two levels of simulator practice
(overtrained or not overtrained), two levels of
simulator fidelity (three or five collator racks), and
a control condition involving no prior practice.
Sheets of 8½ X 11 inch mimeograph paper filled the
racks. For the “real” task all eight racks were used.

The collator is basically a foot-hand-eye coordination
task. A foot lever controls mechanical arms that push
the paper out of the racks. The operator must then
grasp the pages of paper, combining the pieces and
stacking them together. The foot pedal determines the
speed at which the paper is pushed out of the racks.
As long as the pedal is depressed, the paper keeps
coming at the operator. Thus, the operator can speed
up or slow down the process by either keeping the
pedal depressed continuously or intermittently
pressing the foot pedal. 

One hundred subjects participated in the experiment,
with 10 male and 10 female subjects randomly assigned
to each of the five experimental conditions. None of
the subjects had had experience operating the
equipment (collator) used in this study. 

[Fig. 1. Performance under each condition on the
transfer task (males). Abscissa: blocks of three
trials. Conditions plotted: IIA, 5 racks (overtrained);
IA, 3 racks (overtrained); II, 5 racks (not
overtrained); I, 3 racks (not overtrained); III,
8 racks (control).]

Each trial consisted of the time the subject required
to complete the stacking of 18 sets of paper, each set
consisting of either 3, 5, or 8 sheets, in proper
top-through-bottom order and properly criss-crossed on
the tray. One trial of under 30 seconds, which is
slightly longer than the maximum speed at which the
collator feeds out 18 sets, was considered the
criterion for simulator performance. In Conditions
I and II, subjects collated with 3 or 5 racks
operating, respectively, and were then transferred to
the real task (all 8 racks operating) upon reaching
criterion on the simulator. In Conditions IA and IIA
(3 and 5 racks operating, respectively), subjects were
overtrained on the simulator task for 5 trials after
the first trial on which they reached criterion. They
were then transferred to the real task. Intertrial
periods of 30 seconds in the simulator training,
60 seconds in training on the real task, and 2 minutes
between the termination of simulator training and
the initiation of real task training were used.⁴ All
subjects were given 15 trials collating 8 racks for
the real task. 

Condition III was the control condition. Subjects
had no prior training before collating 8 racks for 15
trials. 
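
The transfer rule described in the procedure can be summarized in a short sketch. It is offered only as an illustration and is not the authors' scoring procedure: the function name, the list-of-times input, and the return of `None` when criterion is never reached are assumptions made here. The 30-second criterion and the 5 overtraining trials are taken from the text.

```python
CRITERION_SECONDS = 30.0   # one trial under 30 seconds counts as criterion (from the text)
OVERTRAINING_TRIALS = 5    # extra simulator trials in Conditions IA and IIA (from the text)


def simulator_trials_before_transfer(trial_times, overtrained):
    """Return how many simulator trials a subject completes before transfer.

    trial_times: times in seconds for successive simulator trials.
    overtrained: True for Conditions IA/IIA, False for Conditions I/II.
    Returns None if criterion is never reached (hypothetical handling;
    the article does not describe this case).
    """
    for trial_number, seconds in enumerate(trial_times, start=1):
        if seconds < CRITERION_SECONDS:
            extra = OVERTRAINING_TRIALS if overtrained else 0
            return trial_number + extra
    return None
```

For a subject whose simulator times are 41.0, 33.5, and 29.0 seconds, the sketch gives 3 trials before transfer without overtraining and 8 trials with overtraining.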


⁴ These intertrial intervals were used to give the
experimenter time for scoring, recording, and
restocking the bin. 


[Fig. 2. Performance under each condition on the
transfer task (females). Abscissa: blocks of three
trials.]


RESULTS 


Although some of the results do not reach normally
acceptable levels of statistical significance, all of
the hypothesized outcomes were in the predicted
direction for the males in the experimental
conditions. That is, overtraining on collating either
three or five racks (Conditions IA and IIA,
respectively) led to slightly poorer performance on
the real task (collating eight racks) than for the
comparable groups with no overtraining (Conditions
I and II; see Figure 1). 

Furthermore, for both the overtrained and not
overtrained conditions, there was a trend for
performance on the real task to be poorest when the
simulator more closely approximated the real task.
Thus, for each pretransfer condition, learning on five
racks and then transferring to eight racks led to
poorer performance (not significant) on the eight-rack
condition than did initial training on three racks.
After Trial 6, the performance of the male controls
(no simulator training) was, if anything, superior to
groups receiving prior simulator training (p < .05).
Performance decrement on the real task was maximized
for subjects overtrained on the tasks with the
greatest simulator fidelity (2A versus 3, p < .05). 

As indicated in Figure 2, these results did 
not hold up for the female group. The per- 
formance of the females on the eight-rack 
control task was considerably inferior to that 
of the male group. The primary finding for 
the females was that overtraining on both the 
three- and five-rack tasks improved real task 
performance over the control condition, but 
this was confined to the overtrained condition 
(p < .15). 


DISCUSSION AND CONCLUSIONS 


If these tentative, illustrative results ob- 
tained for the males are generalizable, it 
would mean that in a training situation on a 
simulator, trainees should not be trained be- 
yond the point at which they have reached 
some minimal criterion of performance. Not 
only could overtraining be wasteful in terms 
of time and money, but it may be detrimental 
to performance on the real task. It would 
also appear that, for at least some tasks, train- 
ing and overtraining males on a simulator 
which more nearly approaches the real task, 
but is not identical with it, leads to greater 
degradation of performance on the real task 
than training on a simulator of less fidelity. 
It may be possible that where the simulator 
has greater face validity, the learner will try 
to transfer inappropriate procedures. Perhaps 
collating three racks as contrasted with five 
racks is so different from the real task of 
collating eight pages that only the machine 
operation is learned and no attempt is made 
to use the same total procedure when trans- 
ferring to the eight pages. 

Can we explain the findings that the differences
did not hold up for the females? Examining the modal
number of trials to reach criterion prior to transfer,
we find practically no difference between the males
and females. (Condition I mode = 2 for males and
females; Condition IA mode = 6 for males and females;
Condition II mode = [...] for males and females;
Condition IIA mode = 6 for males, 7 for females.) This
suggests that there was initially very little
difference in the ability of the two groups to operate
the machine when three or five bins are used. However,
when eight bins are used (the control condition), the
males show some superiority over the females. This may
be due to the fact that the span of the males’ hands
is larger than that of the females’, and this aids
the males in collecting the eight
sheets after they are pushed forward by the 
collator. If a slightly different hand movement 
is required for collecting eight sheets than for 
five, and the males try to transfer their tech- 
nique of collecting the five sheets to the eight 
sheets, then habit interference would occur. 
On the other hand, if the females have to use 
a similar technique on the five sheets as well 
as the eight sheets (due to the size of the 
span of their hands), then the training on five 
sheets is a more appropriate simulation sit- 
uation for the females than for the males. The 
strategy learned is positively transferred for 
them. This speculation is somewhat supported, 
although not significantly, by the fact that 
in the early transfer trials (first three trials) 
the females do better when initially trained on 
five sheets (with or without overtraining) 
than when the initial training is with three 
sheets. 

It is important that we determine when in 
the real task we want to examine performance. 
For males, for example, no simulation training 
leads to relatively poor performance early in 
the real task but equally good or better per- 
formance later in this real task. To some 
extent, however, all of the simulation condi- 
tions improved performance on the first trans- 
fer trial over the control condition in which 
no prior training was given. 

Thus, it would appear that, in this case, if 
we are interested in “late” performance on the 
real task, simulation training does not have a 
very great advantage; on-the-job training is 
probably more efficient. If, on the other hand, 
the very first trial on the real task is of great 
consequence (such as in emergency ditching 
procedures), then simulation training may be 
worthwhile. 

It becomes incumbent upon the trainer to 
know what criterion is important and to stop 
the training at the point which optimizes 
the appropriate performance on the real task. 
In general, it would seem that training should 
stop soon after the operation of the mecha- 
nism is learned and before habits specific to 
the performance on the simulator are de- 
veloped. Since overtraining may or may not 
be advantageous in a simulation situation, it 
is important then to determine what is being 
overtrained. It would appear that if a strategy 
is well learned, but not the specifics of the 
training device, then this is not detrimental to 
ultimate performance. It would also seem that 
a training device may require less fidelity and 
briefer training periods than one might at first 
imagine. More research is needed to evaluate 
these possibilities. 


Conjecture 


If future research indicates that present 
hypotheses and suggestive findings have 
merit, one might extrapolate to a variety of 
training situations. For example, in T group 
or sensitivity training, could it be that the 
trainee learns not only to be open and to consider
the motivation of others, but also learns 
(incorrectly) that others will be open and 
sensitive to his motives? If so, this might 
lead to frustration outside of the training 
situation. The same may apply when role 
playing is employed as “simulation” training. 
Perhaps role playing is not as effective as it 
might be, since not only are certain principles 
learned but particular roles are also learned 
in relation to particular individuals. If this is 
the case, it might be wise to stress the principles,
not the roles, and to stop the training 
once it becomes apparent that the strategy or 
principles are learned, that is, before individuals
become wedded to particular performances. 


TASK EGO-INVOLVEMENT AND SELF-ESTEEM AS 
MODERATORS OF SITUATIONALLY DEVALUED 
SELF-ESTEEM 


YOASH WIENER ¹ 


Case Western Reserve University 


Self-consistency theory predicts that when a person’s self-esteem is situationally
devalued in a personal, face-to-face manner, he will tend to lower performance,
whereas the protection of self-concept notion makes the opposite prediction of
increased performance. The present paper incorporates two separate experiments
dealing with 2 factors (ego-involvement with task and chronic self-esteem) in
order to investigate the conditions under which each of the above
consequences may occur. The design of each experiment was a 2 X 2 factorial.
Results indicated increased productivity in the high conditions of involvement and
chronic self-esteem, suggesting the predicted protection of self-concept effects;
however, productivity was not decreased in the low conditions, indicating a
failure of self-consistency effects to emerge. 


What happens when a person is in a situa- 
tion in which he believes he is being evaluated 
at a lower level than he deserves? Does he 
behave in a manner consistent with this 
evaluation, or does he try to protect himself? 

The present study investigates the effects 
of one form of a situational devaluation of 
self-esteem. This form involves a direct, per- 
sonal, and face-to-face deflation of a person’s 
self-concept. Two theoretical notions, self-consistency
theory (e.g., Korman, 1970) and 
the “protection of self-concept” theory (e.g., 
Lawler, 1968), make conflicting predictions 
about the performance implications of this 
condition. The major aim of the study is to 
reconcile the two conflicting predictions. 

Self-consistency theory predicts a decrease 
in effectiveness of performance as a con- 
sequence of this situational devaluation of 
self-esteem. This prediction follows from the 
general hypothesis that, all other things being 
equal, the less a person perceives himself to 
be competent, skilled, or qualified for a par- 
ticular task, the lower will be his performance 
on this task. By lowering his level of perfor- 
mance, he avoids inconsistency between his 
self-concept and overt behavior. However, 
there has been very little testing of this
“situationally devalued self-esteem” hypothesis 
(Aronson & Carlsmith, 1962; Korman, 1970). 

The above self-consistency prediction of 
decreased performance is in direct conflict 
with the theory of “reaction to devalued self- 
esteem,” or “protection of the self-concept,” 
as it will be labeled in the present study. This 
theory has been suggested and supported 
mostly by equity theory investigations (e.g., 
Goodman & Friedman, 1968; Lawler, 1968; 
Wiener, 1970). 

The theory of protection of the self-concept 
suggests that when a subject’s qualifications, 
abilities, and skills are evaluated and declared 
to be low, he is presented with a threat to his 
self-concept and self-esteem. Protection of the 
self-concept under such conditions might be 
a likely reaction. The individual may try to 
convince either himself, his evaluator, or both, 
that he really is more effective than he is 
believed to be. One way to do that is to 
attempt to perform effectively and to do an 
especially good job. Hence, the basic predic- 
tion of the protection of the self-concept 
theory is increased performance when the 
qualifications of a subject are evaluated to be 
low. 
The basic question of the present investigation
is how the opposing motivational forces 
of the situationally devalued self-esteem condition
manifest themselves in performance 
variables. The present study will investigate 
possible conditions under which each of the 
two seemingly conflicting processes (the
self-consistency process and the protection of
self-concept process) occurs. Two sets of such 
conditions are investigated in two separate 
experiments to be reported in this paper. 

The first experiment investigated the vari- 
able of “ego-involvement” with the task as a 
possible moderator of performance consequences
of devalued self-esteem. Ego-involvement
with a task can be determined by 
the value a person ascribes to the skills and 
abilities that the task requires. The more 
such abilities are valued, the higher will be 
the ego-involvement with the task. 

It is conceivable that under a condition of 
high involvement with the task, pressures for 
protection of the self-concept will be strongest.
Under low involvement, where such pressures
are low, it should be easier for self-consistency
processes to emerge. The main 
hypothesis, then, is that when self-esteem is 
devalued, increased productivity would occur 
under conditions of high ego-involvement with 
the task, whereas decreased productivity 
would occur under conditions of low task 
involvement. 

In general, regardless of self-esteem conditions,
there should be higher productivity in 
the high-involvement conditions than in the 
low-involvement conditions. This is in line 
with previous findings (e.g., French, 1955; 
Kausler, 1951) which indicate that level of 
performance on a task is increased by instructions
explaining that the task measures a 
highly valued ability. 

The second experiment to be reported in 
this paper investigated the variable “chronic” 
or “stable” self-esteem as a possible moderator [...]
self-consistency effects, that is, reduced productivity,
will occur for subjects low on 
chronic self-esteem. Protection of self-concept 
effects, that is, increased productivity, will 
occur for subjects high on chronic self-esteem. 

In both experiments, a performance vari- 
able of quality of work was included. Indi- 
cations from previous research (Wiener, 
1970) are that quality of performance will 
not be affected in any way by a devaluation 
of self-esteem. Both present experiments also 
included a number of cognitive-attitudinal 
measures as dependent variables, in order to 
find clues for possible alternative modes of 
resolution of the self-consistency dilemma 
and in order to check manipulations. 


EXPERIMENT 1 


Method 
General Design 


The basic design of the experiment was a 2 X 2 
factorial, which included the independent variables 
of situational self-esteem and involvement with the 
task. The two levels of the self-esteem variable were 
“devalued self-esteem” and a control condition of 
“unchanged self-esteem.” Subjects in each of these 
groups were further assigned either to high-involvement
or low-involvement conditions. 


Subjects 


Forty-eight male undergraduates in elementary 
psychology courses were randomly assigned to the 
four conditions, thus providing 12 subjects per
condition. It was explained that they would not be 
tested themselves, but that their work would provide
information on a certain verbal task that was 
essential for a test development project. 


Task 


The task consisted of extracting five-letter words 
from a source (French prose passage) and perform- 
ing several verbal manipulations related to these 
words. Subjects were asked to perform all seven 
manipulations for the 35-minute task, and only the 
first five for the 8-minute pretest. The first opera- 
tion is typical: 

(A) Notice the word immediately preceding the 
five-letter one. If it contains two or more vowels, 
write Y in this column. Otherwise write N. 
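
Operation (A) is mechanical enough to be written out as a short sketch. The code below is an illustration only, not the scoring key used in the study: the function name, the tokenizing rule, the set of letters counted as vowels (including accented French vowels and y), and the treatment of a five-letter word with no preceding word (scored N) are all assumptions made here.

```python
import re

# Assumption: the article does not say which letters count as vowels.
VOWELS = set("aeiouyàâäéèêëîïôöùûüÿ")


def operation_a(passage):
    """Apply operation (A) to every five-letter word in a passage.

    Returns a list of (five_letter_word, "Y" or "N") pairs, where "Y"
    means the immediately preceding word contains two or more vowels.
    """
    # Runs of letters only; digits, underscores, and punctuation split words.
    words = re.findall(r"[^\W\d_]+", passage.lower())
    answers = []
    for index, word in enumerate(words):
        if len(word) != 5:
            continue
        # Hypothetical rule: a five-letter word that opens the passage
        # has no preceding word and is therefore scored "N".
        preceding = words[index - 1] if index > 0 else ""
        vowel_count = sum(1 for letter in preceding if letter in VOWELS)
        answers.append((word, "Y" if vowel_count >= 2 else "N"))
    return answers
```

Applied to the made-up sentence "Le petit chien mange une pomme rouge", the sketch marks "petit" as N (it follows "le", which has one vowel) and the remaining five-letter words as Y.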

The task was used and described in more detail 
by Wiener (1970). It was designed to be credible to 
both high- and low-involvement subjects, easy to 
learn, sensitive to motivational changes, and able to 
yield simple and uncorrelated measures of quantity 
and quality. 


Procedure 

Upon arrival, all subjects were told that they had
to work for a few minutes on a specific task that
was very similar to the main task. It was explained
that in order to achieve valid analysis of the obtained
information, subjects who performed extremely
poorly on the short task would have to be excluded
from working on the main task. They were further
told that usually very few people were excluded and
that soon after the completion of the short task
they would be informed about their performance. 

Each subject was then presented with a copy of
the French prose selection, the task instructions form,
and an answer sheet. He was instructed to locate
each five-letter word appearing on the copy and to
perform for each such word the first five operations
out of the seven listed on the instruction form. The
subject was seated in another room and after working
for eight minutes was stopped by the experimenter
and asked to wait until the experimenter
looked over the work. 

This procedure was intended to provide the context
for the devaluation of the self-esteem manipulation
by enabling the experimenter to make comments
about the subjects’ ability. It also provided
data about the subjects’ performance prior to the
experimental manipulation, so that analysis of
covariance would be possible. 

When the experimenter was back with the subject,
he proceeded with the experimental manipulations
as follows: 

Devalued self-esteem condition. Well, you have
not done a particularly good job. In fact, your
performance was poor, considerably below the
performance of the average person that we see
here working in this short task. It is acceptable
for our purposes, but just barely. Now, I’ll need
your signature on this. [The experimenter handed
a participation form that included rating of
qualifications to the subject. Prominently circled
was the poor category. The other categories
under the heading acceptable were very good,
good, and fair. An additional heading was not
acceptable. This was carried out in order to make
the manipulation more salient for the subject.] 

Unchanged self-esteem (control) condition.
Well, everything is fine. Now, I’ll need your
signature on this. [The experimenter handed the
participation form to the subject. Prominently
circled was the acceptable heading. None of the
specific categories very good, good, etc., were
marked.] 

After the self-esteem manipulation, the experimenter
proceeded with the “involvement with task” induction. 

High-involvement condition. The next thing
you will do is similar to what you have done
before. You already know that you are
participating in a test development project. What
we are attempting to get is a measure of verbal
intelligence. This aptitude is a crucial component
of general intelligence. There is no need, of
course, to overstate the importance of this
aptitude for success in many areas of endeavor,
such as professional life, leadership, and
obviously, academic performance. A later part of
this project may be to correlate the results we
obtain now with grades and other intelligence
measures. Right now we are working on the
content and administrative aspects. Here is
another passage in French. The directions are
the same as before, except this task will be
longer and you are to perform all the operations.
[The experimenter points to the operational
specifications. The experimenter then gives
the subject 15 answer sheets.] Let me show you
the place where you will work. 

Low-involvement condition. The next thing
you will do is similar to what you have done
before. You already know that you are
participating in a test development project. What we
are attempting to get is some measure of routine
clerical functions. Some of these functions are
typing, proofreading, filing, name and number
checking, etc. While we admit that these functions
are irrelevant for the average college student,
they seem to be important both for secretarial
training and for selection tests of junior
secretaries. [From this point, the induction is as
in the high-involvement condition.] 

After the above inductions and explanations, the
subjects worked for 35 minutes. At the end of this
time, each subject was told that he had completed
the task and was asked to fill out the employment
planning survey questionnaire. This, it was explained,
would be used “to help us in planning and setting
up future employment for students.” In this
questionnaire, the subjects responded to 9-point rating
scales relating to attitudinal-cognitive variables. The
variables investigated were (a) perception of capacity
to perform the tasks, (b) importance of doing well
on the job, (c) how worthwhile the task is, (d)
difficulty of task, (e) capability to improve
performance, and (f) satisfaction with task. The first
two variables were used as manipulation checks and
the others as dependent variables. These measures
may give an indication of cognitive-affective modes
of reaction to devalued self-esteem in addition to or
instead of task performance. 

When leaving, the subject was informed of the
real purpose of the project and the nature of the
experimental manipulations. 


Results and Discussion 

Manipulation Checks 

Perception of capacity to perform the task.
This is a check on the effectiveness of the
instructions manipulating the subjects’ self-esteem
by devaluating their perception of
their capacity to perform the task. The subjects
rated themselves as to the degree to
which they perceived themselves as capable
of performing the task. A 2 X 2 analysis of
variance showed that the main effect of the
self-esteem variable was significant (df =
1/44, p < .01). The mean capacity rating, as
expected, was higher (6.96) for the unchanged
self-esteem group than for the devalued
self-esteem group (5.50). 


TABLE 1 
MEANS OF NUMBER OF WORDS PROCESSED (PRODUCTIVITY)
FOR ALL CONDITIONS IN EXPERIMENTS 1 AND 2
(STANDARD DEVIATIONS IN PARENTHESES) 

Condition                     Devalued        Unchanged
                              self-esteem     self-esteem

Experiment 1
  High involvement            80.75 (15.1)    64.50 (13.9)
  Low involvement             60.58 (9.1)     59.83 (9.8)

Experiment 2
  High chronic self-esteem    78.00 (9.5)     64.93 (16.1)
  Low chronic self-esteem     69.27 (11.9)    64.20 (16.4)


The importance of doing well on the job. 
This is a check on the involvement with task 
manipulation. Since they were told that they 
were working on a test related to intelligence, 
it was expected that the high-involvement 
subjects would rate the importance of doing 
your best on the job higher than the
low-involvement subjects. This expectation was 
confirmed. A 2 X 2 analysis of variance on the 
data revealed a significant main effect of
involvement (df = 1/44, p < .025), where the 
mean for the high-involvement subjects was 
higher than for the low-involvement subjects 
(5.87 and 4.29, respectively). 


Productivity Data 


Productivity was operationally defined as 
the number of five-letter words processed by 
each subject. 

Table 1 (Experiment 1) shows the mean 
productivity of the four treatment groups 
(N = 12 for each group). 

A 2 X 2 analysis of variance on the productivity
data showed a significant main effect 
of involvement where high-involvement subjects
produced more than low-involvement 
subjects (df = 1/44, p < .005). The interaction
of involvement and self-esteem was 
significant (df = 1/44, p < .05). Under the 
high-involvement conditions, devalued self-esteem
subjects produced considerably more 
than the unchanged self-esteem subjects, 
while under the low-involvement condition, 
the devalued and the unchanged self-esteem 
subjects produced about the same. 
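
As a rough check on how these tests relate to Table 1, the F ratios can be approximated from the cell means and standard deviations alone. The sketch below is not the author's analysis. It assumes that the tabled values are sample standard deviations, that all four cells have n = 12, and that no covariate adjustment was applied; the function name and cell ordering are choices made here. For a balanced design, a 1-df contrast L among cell means has a sum of squares of n·L²/Σw², and the error mean square is the average of the cell variances.

```python
def contrast_f(means, sds, n, weights):
    """F ratio (1 df) for a contrast among cell means in a balanced design."""
    contrast = sum(w * m for w, m in zip(weights, means))
    ss_contrast = n * contrast ** 2 / sum(w ** 2 for w in weights)
    # With equal n, the pooled error variance is the mean of the cell variances.
    ms_error = sum(sd ** 2 for sd in sds) / len(sds)
    return ss_contrast / ms_error


# Experiment 1 cells from Table 1, in the order:
# high/devalued, high/unchanged, low/devalued, low/unchanged
means = [80.75, 64.50, 60.58, 59.83]
sds = [15.1, 13.9, 9.1, 9.8]
n = 12

f_involvement = contrast_f(means, sds, n, [1, 1, -1, -1])
f_interaction = contrast_f(means, sds, n, [1, -1, -1, 1])
error_df = len(means) * (n - 1)

print(f"involvement: F(1, {error_df}) = {f_involvement:.2f}")
print(f"interaction: F(1, {error_df}) = {f_interaction:.2f}")
```

Under these assumptions the sketch gives F(1, 44) of about 12.3 for involvement and about 4.8 for the interaction, which is in line with the reported significance levels (p < .005 and p < .05, respectively). Small differences from the original analysis would be expected because the tabled values are rounded.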


Quality Data 


The quality score was defined as a percentage: 
(errors made times 100) divided by total 
words processed. The correlation between
productivity and quality scores was .04. A 2 X 2 
analysis of variance on the quality data did 
not show any significant effect. 
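
The quality score is a simple error percentage, so a higher score means poorer quality. A minimal sketch follows; the function name and the error raised for a zero word count are assumptions made here, not part of the article.

```python
def quality_score(errors, words_processed):
    """Errors as a percentage of words processed (higher = poorer quality)."""
    if words_processed <= 0:
        # Hypothetical guard: the article does not discuss empty protocols.
        raise ValueError("words_processed must be positive")
    return errors * 100 / words_processed
```

For instance, a hypothetical subject who made 3 errors while processing 60 words would receive a quality score of 5.0.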


Cognitive-Affective Variables Data 


Out of the four cognitive-affective depen- 
dent variables (mentioned at the end of the 
procedure section), only the satisfaction with 
the task data showed indications that it was 
affected by the experimental manipulations. 
Satisfaction scores were obtained by responses 
to a 9-point rating scale. Analysis of variance 
of these data did not show any significant 
effect. However, the devalued self-esteem 
subjects did show a clear tendency to be less 
satisfied than the unchanged self-esteem sub- 
jects (df = 1/44, p < .10). The mean satis- 
faction rating for the devalued self-esteem 
subjects was 4.74 compared to 5.61 for the 
controls. 

The productivity results did not fully support
the experimental hypothesis. It was 
hypothesized that an interaction of situational 
self-esteem and task involvement would be 
obtained. The devalued self-esteem subjects 
under the low-involvement condition were to 
produce less than controls while the devalued 
self-esteem subjects under the high-involvement
condition were to produce more than 
controls. The results showed a significant 
interaction, but the pattern of means (Table 
1) was not as predicted. The devalued
self-esteem subjects under the high-involvement 
condition did indeed produce more than controls,
but under the low-involvement conditions
they produced about the same as the 
controls. These results indicate, as hypothe- 
sized, protection of the self-concept effects 
in the high-involvement condition, but fail 
to show any self-consistency effects. 

One possible explanation for the failure of 
the self-consistency notion to be supported in 
this experiment may follow from Korman’s 
(1971) suggestion 

that there is probably a lower limit operating in the
normal work situation as to how poor performance
can be, with this probably determined by general 
societal norms concerning the desirability of doing 
well and the implied threat of punishment (discharge)
in the extreme low-performance situation 
[p. 45]. 

It is possible that such a “lower limit” of poor 
performance had been already reached by the 
low-involvement control subjects. If this was 
the case, self-consistency pressures could not 
have been manifested by a further reduced 
productivity of the devalued self-esteem
low-involvement subjects. This possible explanation,
of course, can be tested experimentally. 

The protection of self-concept notion received
clear support from the results of the 
present experiment. The obtained interaction 
bore out the expectation that a “reaction” to 
protect a threatened self-concept should be 
manifested only, or at least mainly, in the 
high-involvement condition, in which the task 
abilities involved are valued and central to 
the self-concept of a person. It is possible, 
then, that in a situation in which self-esteem 
is “personally” devalued, the pressures for 
protection of the self-concept are more powerful
than the “self-consistency” pressures. 
Thus, under a favorable condition for their 
emergence (high involvement with task),
these protection of self-concept pressures manifest
themselves clearly by increased performance,
and under an unfavorable condition of 
low involvement, they are still sufficiently 
strong to block or inhibit potential self-consistency
effects of reduced performance. 

The increase in productivity which occurred 
in some of the experimental conditions did not 
come about at the expense of quality of work. 
No difference between conditions was found 
with regard to work quality. This supports 
previous findings that when equal emphasis 
is placed on both quantity and quality, a
protection of self-concept would manifest itself 
by higher quantity, rather than by a better 
quality of work (Wiener, 1970). This is un- 
like the performance effects of the self-consist- 
ency process of “confirmation of self-concept 
of possessed abilities.” In such a situation, 
when a person believes or is led to believe that 
he is skilled and competent, he will tend to 
confirm this self-perception by higher quality, 
rather than by larger quantity of work (Kess- 
ler & Wiener, 1972). 


EXPERIMENT 2 
Method 


The method of Experiment 2 was similar to the 
method of Experiment 1, except for the following 
main differences: 

1. The involvement variable was not manipulated 
in the present experiment. All subjects were told that 
they were participating in the development of a 
“linguistic differentiation” test. It was expected that 
this induction would establish for all subjects a set 
of medium task involvement. It is recalled that 
Experiment 1 manipulated high and low involvement. 
As can be seen from the productivity results (Table 
1), the mean productivity of all subjects in Exper- 
iment 2 (69.10) was smaller than the mean produc- 
tivity of the high involvement (72.63) and larger 
than the mean productivity of the low involvement 
(60.20) in Experiment 1. This tended to confirm the 
expectation of medium involvement of all subjects 
in Experiment 2. 

2. Only one independent variable was manipulated 
in the present experiment—the situationally devalued 
self-esteem. The induction was the same as in 
Experiment 1. The other independent variable was a 
measure of chronic self-esteem that was given to all 
subjects at the beginning of the experiment, just 
before they took the pretest. The pretest and the 
main task were the same as in Experiment 1. 

The general procedure of this experiment was as 
follows: Sixty undergraduates took part in the study. 
They were assigned randomly to the two conditions 
of manipulated, situational self-esteem (devalued 
self-esteem and control), with 30 in each condition. 
Within each of these conditions, subjects were
divided on the basis of their scores on the chronic
self-esteem measure into high- and low-self-esteem
groups. The median chronic self-esteem score in each
manipulated self-esteem condition was the cutoff
point for the above division. (The mean score for
the high-self-esteem subjects was 26.2 and for the
low-self-esteem subjects it was 18.8.) This procedure
provided for a 2 × 2 factorial design with 15 subjects
per cell.
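The median-split assignment described above can be sketched in a few lines. This is only an illustration under stated assumptions: the scores are invented, the function name `median_split` is hypothetical, and scores exactly at the median are placed in the low group, a tie rule the article does not specify.

```python
from statistics import median

def median_split(scores):
    """Label each score 'high' or 'low' relative to the group's own median.

    Assumption (not stated in the article): ties at the median go to 'low'.
    """
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]

# Hypothetical chronic self-esteem scores for one manipulated condition.
devalued = [17, 25, 19, 28, 21, 30]
labels = median_split(devalued)
```

Applying the split separately within each manipulated condition, as the text describes, keeps the cell sizes equal across the 2 × 2 design.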

The chronic self-esteem measure used was the
self-assurance scale of the Ghiselli self-description
inventory (unpublished manuscript described in Korman,
1968). This measure has been used extensively
by Korman in his self-consistency studies. Korman
(1968) summarizes some validity data for the
measure.


Results 
Manipulation Check 


Perception of capacity to perform the task.
As in Experiment 1, this was a check on the
self-esteem manipulation. A 2 × 2 analysis of
variance on the capacity rating showed a significant
main effect of the self-esteem variable
(df = 1/56, p < .001). The mean capacity
rating was higher (6.83) for the unchanged
self-esteem group than for the devalued
self-esteem group (5.20). It is also interesting to
note that the main effect of chronic
self-esteem was significant (df = 1/56, p < .001).
High chronic self-esteem subjects perceived
themselves to possess higher capacities than
low chronic self-esteem subjects (6.77 and
5.27, respectively).


Productivity Data 


Productivity was measured as in Experiment 1.
Table 1 (Experiment 2) shows the mean productivity
of the four treatment groups (n = 15 for each group).

A 2 × 2 analysis of variance on the productivity
data showed a significant main effect of
manipulated situational self-esteem (df = 1/56,
p < .025). The devalued self-esteem subjects,
under both conditions of chronic self-esteem,
produced more than the control, unchanged
self-esteem subjects. Some tendency toward an
interaction between chronic self-esteem and
situational self-esteem can be observed in the
data. The high chronic self-esteem subjects raised
their productivity more when their self-esteem was
devalued than did the low chronic self-esteem
subjects.²


Quality Data 


The quality score was defined as in Experiment 1.
A 2 × 2 analysis of variance on the data did not
show any significant effect.


² Analysis of covariance of these data, using the
productivity scores from the short pretest as a
covariate, provided the same pattern of results as the
analysis of variance and did not raise the significance
level of the interaction to an acceptable level.
Cognitive-Affective Variables Data


As in Experiment 1, of all cognitive-affective
variables, only the task satisfaction data
showed effects of the experimental manipulations.
A 2 × 2 analysis of variance on these
data revealed main effects of both chronic
self-esteem and manipulated, situational
self-esteem (df = 1/56, p < .025; df = 1/56,
p < .001, respectively). Low chronic self-esteem
subjects (4.74) and devalued self-esteem
subjects (4.53) showed less satisfaction than
high chronic self-esteem subjects (5.73) and
unchanged self-esteem subjects (5.93),
respectively.

The productivity results, as in Experiment 1, did not
fully support the experimental hypothesis. The
anticipated interaction between situational self-esteem
and chronic self-esteem did not materialize. Devalued
self-esteem subjects, in both high and low chronic
self-esteem conditions, produced more than the controls.
These results do not provide any indication of the
self-consistency effects of reduced productivity. On the
other hand, the devalued self-esteem subjects showed
indication of protection of the self-concept effects,
since all of them produced more than the controls. The
tendency to protect the self-concept, as expected, was
larger for the high chronic self-esteem subjects than
for the low ones, as can be seen from Table 1.

The present results tend to weaken the alternative
notion of a possible lower limit of performance under
the low-involvement condition that was offered to
explain the lack of self-consistency effects in
Experiment 1. There could not have been a lower limit of
performance in the present experiment, since the
subjects performed under a set of medium involvement
with the task.

Other explanations for the failure of self-consistency
effects to be manifested by the low chronic self-esteem
subjects also do not seem to be strong enough. The
validity of the chronic self-esteem measure may be
questioned. Ghiselli's self-description inventory has
not been subjected to systematic validation studies. It
should be noted, however, that Korman in his
self-consistency research employed this measure quite
successfully. Also, it is possible that
the “extreme groups” procedure of assigning
subjects to the chronic self-esteem conditions 
would have been a more sensitive treatment 
of this variable than the assignment of sub- 
jects on the basis of a median score. However, 
an inspection of the data from the present 
experiment does not give any indication that 
employing this procedure would have changed 
the results meaningfully. 

Another possible explanation for the lack 
of self-consistency effects in both experiments 
asserts that subjects may not have exper- 
ienced pressure to achieve self-consistency. It 
is possible that in spite of the negative eval- 
uation of their ability by an “authority,” they 
did not perceive themselves at all as incom- 
petent. However, it is quite clear from the 
manipulation check “perception of capability” 
that this did not happen. The devalued self- 
esteem subjects had a significantly lower per- 
ception of their competence than did the un- 
changed self-esteem subjects. 

It seems, then, that the failure of self-con- 
sistency pressures to emerge is not due to any 
methodological problems. Based on the results 
of both experiments, it seems that individuals 
just do not tend to resolve self-consistency 
pressures originated by situational-personal 
devaluation of self-esteem by reduced and in- 
effective performance. 

Instead, both Experiments 1 and 2 suggest that of the
two motivational forces operating when individuals’
self-esteem is devalued in a personal, face-to-face
situation (self-consistency and protection of
self-concept processes), the latter is the more
powerful. Under the favorable condition of high chronic
self-esteem, the pressures to protect the self-concept
clearly manifested themselves by a much increased
performance. Under the unfavorable condition of low
chronic self-esteem, these pressures were still strong
enough not only to inhibit the potential
self-consistency effect of reduced productivity, but
actually to bring about some increased productivity.

The results of both experiments do not show how the
pressures for self-consistency were resolved. The
nonsignificant results obtained from the
cognitive-affective variables indicate that these
pressures may not have been resolved at all, assuming,
of course, that the scales were not too crude to pick up
alternative modes of resolution of the self-consistency
dilemma.

The only attitudinal variable found to be 
affected by the experimental manipulations in 
both experiments was “satisfaction” with the 
task. Devalued self-esteem subjects showed 
a higher level of dissatisfaction than the 
control subjects. Even though the present 
experiments are not able to unequivocally 
determine the source of this dissatisfaction, 
perhaps it is indicative, at least in part, of 
the underlying unresolved self-consistency
pressures. The present experiments also do not
provide information about the possible longer 
term consequences of the dissatisfaction of 
the devalued self-esteem subjects. It is pos- 
sible that the dissatisfaction eventually would 
have led to decreased performance or even to 
withdrawal from the task. This would be in 
line with the performance hypothesis of the 
self-consistency model. On the other hand, 
it is also possible that the dissatisfaction 
would not have any performance consequences 
or that it would wear off altogether. Data per- 
taining to these possibilities can be obtained 
from an experimental design that will allow 
long-term effects to manifest themselves. 


Conclusion 


There is very little research investigating 
the relationship between a personal, face-to- 
face devaluation of self-esteem and perfor- 
mance. One study by Korman (1970) found 
that individuals who were told that they were 
competent for a specific task performed 
better than those who were told they were 
incompetent. Korman’s experiment, however,
did not have a control group of “unchanged
competence.” It is impossible, then, to determine
whether the incompetent group reduced
performance, or the competent group increased
performance.

Several studies give an indication that when 
self-esteem is situationally devalued, but in a 
manner that is not personal and face-to-face, 
self-consistency effects may occur. Aronson 
and Carlsmith (1962) found that when a 
person’s performance is too high for his self- 
conceptions of his ability, he will subsequently 
decrease performance (“reject success”) in
order to maintain self-consistency. While some
of the attempts to replicate the Aronson and
Carlsmith finding have been unsuccessful
(e.g., Ward & Sandvold, 1963), other replications
have succeeded (e.g., Cottrell, 1965).
More recently, Mettee (1971) and Marecek
and Mettee (1972) made quite a successful
attempt to clarify the condition under which 
this finding holds. 

Other conditions in which self-consistency 
pressures are aroused without a personal, 
face-to-face devaluation of self-esteem also 
show clear self-consistency effects. Low 
chronic “self-concept” has been found to be 
related to less effective functioning (e.g., 
Shaw, 1968). High chronic self-concept has 
been found to be associated with increased 
effectiveness of performance (e.g., Denmark 
& Guttentag, 1967), and situationally en- 
hanced self-concept has been shown to bring 
about an increased performance (e.g., Kauf- 
man, 1963; Kessler & Wiener, 1972). 

In conclusion, then, it seems that at least 
under one form of situationally devalued self- 
esteem—in which the self-concept devaluation 
is personal, face-to-face, and direct, as in the 
present experiments—self-consistency effects 
do not appear. Under such conditions, the pro- 
tection of self-concept reaction is most likely 
to occur. This reaction not only inhibits and 
blocks self-consistency effects of reduced per- 
formance from emerging, but under certain 
conditions brings about an increased perfor- 
mance. 

Some previous research from the cognitive consistency
area tends to support the protection of self-concept
explanation. For example, Adams (1965), while
summarizing the conditions that may affect the choice of
one or another mode of resolving inequity dissonance,
suggested that a person “will resist real and
psychological changes in inputs that are central to his
self-concept and to his self-esteem [p. 295].” It seems
that such was, indeed, the reaction of the devalued
self-esteem subjects in the present experiments.
CHILDREN'S CHOICE OF PLAYGROUND EQUIPMENT:
DEVELOPMENT OF METHODOLOGY FOR INTEGRATING USER
PREFERENCES INTO ENVIRONMENTAL ENGINEERING ¹
The study was aimed at determining whether preference scaling
techniques could be used as a means of aiding engineering system
design. Children’s play equipment was selected as the design area.
Paired comparison and rank-order methods were used to determine
whether eight- and nine-year-old children had preferences for
different types of play equipment; whether these preferences were
reliable over time; and whether actual use of play equipment was
predictable from the preference scales developed. Photographic
stimuli were used for the scaling and applied to 48 children. In
addition, the frequency of use of play equipment was determined using
time-lapse photography. The results indicated that the children do
have stable and reliable preferences for play equipment, and these
preferences were correlated with actual usage of the equipment.


Engineering design and the development of 
technology, in general, is often carried on 
without systematic and operational involve- 
ment of the ultimate users of these systems. 
Aside from marketing studies and user panels 
for certain classes of consumer goods, the end 
users of engineering systems are rarely in- 
volved in their design. This is most evident in 
public works such as transportation and
recreational facilities. These systems are 
usually planned and engineered by profession- 
als who are presumed knowledgeable about 
the needs and preferences of the people for 
whom they are designing. Not only is this 
professionalism open to question (Bishop, 
Peterson, & Michaels, 1972), but a method- 
ology for measurement of user needs and 
preferences which designers and engineers can 
employ as part of the design process has also 
been lacking. It was the purpose of this study
to investigate and validate one methodology
for relating preferences to system design. To
this end, one major design process was
selected for study: children’s play environment.

¹ This research was supported under Grant 5-ROI-EC-00301 from the
Environmental Control Administration, U. S. Public Health Service,
Department of Health, Education, and Welfare, and a grant from the
Sloan Foundation. This paper is part of a doctoral dissertation by
Robert L. Bishop.

² Robert Bishop is currently working as a private consultant in
Aspen, Colorado. Requests for reprints should be sent to Richard
M. Michaels, Transp[...], Northwestern University, 2001 Sheridan
Road, Evanston, Illinois 60201.

In order to conduct such a study, a tech- 
nique was needed that employs representa- 
tional stimuli requiring minimal verbal 
response. The use of photographs as stimuli 
and of psychometric scaling techniques offers
an obvious approach with children. 

Using photographs as stimulus material is 
not new. In an anthropological study, Collier 
(1957) made tests to compare interviewing 
with photographs to interviewing with strictly 
verbal stimuli and found that with photo- 
graphs the respondents were more cooperative 
and the results were more consistent. There 
have also been indications that, for adults at 
least, preferences for photographs are correlated
with preferences for what they actually
depict. Coughlin and Goldstein (1970), in a
study on environments and photographs of
those environments, found that the two were
correlated significantly and therefore suggested
that photographs could be used as a
representation of the real world. Sutton-Smith 
(1965), in studies on the relationship between 
children’s responses to play inventories and 
their actual free play behavior, also showed 
that systematic relationships did exist between 
certain types of play preferences and play 
behavior. 
This study is aimed at testing three specific
hypotheses:

1. Children can construct internally
consistent preference scales for play
equipment using photographs as stimulus
material.

2. The preference scales will be consistent
over time.

3. The preference scales so constructed
will be correlated with actual use of the
play equipment in a playground.


METHOD 
Procedure 


[...] in Evanston, was chosen for the study because the spatial
rel[...] at play. The play facilities at thi[...] into two areas:
[...] since eight- to nine-[...]ian age of children who [...]est use.

Development of Preference Scales 
The six pieces of play equipment in the latter area were
photographed individually. Each photograph [...]ground [...]
(b) slide, (c) [...] (horizontal) bars, (d) see-[...] conventional,
unadorned desi[...].

Forty-eight children [...] were shown the photo[...] order [...]
tedious than paired comparison, but the [...] were quite responsive
to the tests and appeared to be highly motivated. They had few
problems understanding the instructions or completing the ordering.
In general, the interviews required only about 15 minutes to
complete.


Observation of Play Behavior 


During the months of August and September 1970,
a time-lapse movie camera was set up to record the
play behavior in the playground where the interview
photos were taken. The camera was set up so as to
allow unobtrusive filming of the entire playground.
In order to obtain as representative a sample of
usage as possible, filming was done on 12 occasions
consisting of recesses, time after school, and complete
days when school was not in session. Two thousand
usable frames were obtained. For a more detailed
description of the filming procedures see Bishop
(1971) or Bishop and Peterson (1971).
Th[...] (Sample A) was [...] meet two criteria: [...]ay activity
and (b) use of [...] its designed purpose. This [...] frames
containing less than 2 or more than [...] those in which children
were near to but not using the equipment. [...] the number of
children using a piece of play equipment was counted and the
proportion of the total determined.

While this sample [...] important l[...] included elimination of
[...] being drawn, several [...]. It was often difficult to determine
whether a child was using the apparatus legitimately, that is, for
the function that [...] was [...]. For many frames, [...] determine
whether [...]ns may have occurred. [...] not clear how the children
[...] photographs used in the interviews, [...] of legitimate use
only or were [...] viewing pictured devices as play objects [...]
totally rejecting frames that contained illegitimate
or questionable use did make noticeable differences.
After examining all the computer-drawn samples, it
was found that two represented adequately the range
of variation. The first (Sample B) consisted of 452
frames which included no illegitimate or questionable
use. The second (Sample E) consisted of 550
frames, including frames with illegitimate or
questionable use, but only legitimate use was counted.
Frames containing fewer than 2 or more than 12
children were excluded from all samples. All frames
in the samples were separated by at least three
minutes. Using intervals greater than three minutes did
not alter the results, and most of the film was
exposed at three-minute intervals. Three samples (A,
B, and E) are presented below for comparison with
the results of the scaling of the photographs.


RESULTS 


Scaling of Preferences 

The responses of the children were scaled
using the methods of rank order and paired
comparison. It was found, in both cases, that
the scales derived were internally consistent,
meeting the Case 5 assumption. Thus, the first
hypothesis was confirmed: Young children
can produce internally consistent preference
scales using photographs as stimuli either by
the method of paired comparison or rank
order.

The second hypothesis was that the two
scaling methods would produce the same interval
scales for the same stimulus materials.
For interval scales, the distribution of all
scale differences should not differ if the two
methods produce the same scale. Using a t
test of differences, this null hypothesis was
tested using the difference matrices. It was
found that the differences between the two
scaling methods were not significant (t = 6,
df = 28).
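The Case 5 scaling referred to above can be illustrated with a minimal sketch. It assumes the standard Thurstone Case V solution (column means of unit-normal deviates of the preference proportions); the article does not give its computational details, and the function name `case5_scale` and the proportions below are hypothetical. The final shift sets the least preferred stimulus to zero, an arbitrary origin of the kind interval scales permit.

```python
from statistics import NormalDist

def case5_scale(p):
    """Thurstone Case V scale values from a paired-comparison matrix.

    p[i][j] is the proportion of judges preferring stimulus j over
    stimulus i. Diagonal entries are ignored. Proportions must lie
    strictly between 0 and 1, otherwise inv_cdf raises an error.
    """
    n = len(p)
    inv = NormalDist().inv_cdf
    # Convert each off-diagonal proportion to a unit-normal deviate.
    z = [[0.0 if i == j else inv(p[i][j]) for j in range(n)]
         for i in range(n)]
    # Column means give the raw scale values.
    raw = [sum(z[i][j] for i in range(n)) / n for j in range(n)]
    # Shift so the least preferred stimulus sits at zero.
    lowest = min(raw)
    return [v - lowest for v in raw]

# Hypothetical proportions for three pieces of equipment.
proportions = [
    [0.5, 0.7, 0.9],
    [0.3, 0.5, 0.8],
    [0.1, 0.2, 0.5],
]
scale = case5_scale(proportions)
```

In this invented example the third stimulus is preferred most often, so it receives the largest scale value and the first stimulus anchors the scale at zero.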

In addition to the tests on the scale differences
as a measure of reliability, a [...] regressions
were also computed. In the first, the scale values
for paired comparison [...] the scale values for
rank [...] both testing periods were [...]. The
correlation coefficient r was found to be .069.
Once again, this analysis indicates the similarity
of the results from the two methods of scaling.


Scale Reliability 
In order to test the third hypothesis, that
the interval scales were reliable over time, the
t test procedure was used to compare both
the paired-comparison scales and rank-order
scales developed in October 1970 interviews
with those developed in January 1971 interviews.
It was found that the two scales derived
using paired comparisons were not significantly
different (t = 1.27, df = 28).
Similarly, there were no significant differences
between the two rank-order scales (t = .43,
df = 28). Finally, there were no significant
differences between the scales produced by the
two methods in January 1971 (t = .42, df = 28).

In addition to these tests, the correlation of 
the two methods over time was determined. It 
was found that the correlation between scale 
values from both methods taken in October 
1970 and January 1971 yielded an r of .952.
This analysis further confirms the reliability 
of the scaling procedure and the stability of 
children’s preferences for the play equipment. 
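A test-retest correlation of this kind is an ordinary Pearson r between the two sets of scale values. The sketch below shows the computation on invented numbers; the function name `pearson_r` and the data are assumptions for illustration only and do not reproduce the study's scale values.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scale values for six pieces of equipment at two sessions.
october = [0.0, 0.4, 0.9, 1.1, 1.6, 2.0]
january = [0.0, 0.5, 0.8, 1.2, 1.5, 2.1]
retest_r = pearson_r(october, january)
```

A value near 1 would indicate that the ordering and spacing of the equipment on the scale changed little between sessions.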

In sum, it is reasonable to conclude that the 
eight- and nine-year-old children were able 
to scale their preferences for play equipment 
using photographic stimuli. They were able 
to produce internally consistent interval 
scales. In addition, their preferences were
found to be unchanged when they were tested 
three months later with the same stimulus 
material. 

The results clearly indicate that in the 
aggregate, eight- to nine-year-old children do 
have stable preferences for different play 
equipment. The scales derived by both 
methods for the two periods are shown in 
Figure 1. 

Note that zero on these scales is arbitrary.
A simple linear transformation was used to set
the least preferred piece of equipment as the
zero point.

Behavioral Validation of the Scales


If the preference scales are valid predictors
of behavior, then there should be a statisti- 
cally reliable relation between the scales of 
preferences and the frequency of use of the 
play equipment by eight- and nine-year-old 
children. As was described earlier, the 
frequency of use was derived from time-lapse 
photography of the playground containing 
the equipment used in the scaling experiment. 
Three different samples of choice behavior
were selected from the film record. In each
case, the proportion of use of each of the six
pieces of play equipment was computed. From
the interval scale values for each of the play
devices, a set of expected proportions of use
was derived. A chi-square test was then
used to test the goodness of fit of these two
sets of proportions. The results of the
chi-square tests showed that the interview scales
[...]y different for Case A, [...] sample of film
selected (χ² = 7.07, df = [...]). [...] the three
samples of observed [...] shown in Figure 2.

Fig. 1. Children’s preference scales.

In addition to the chi-square tests, a
regression was carried out between the actual
proportionate use of the play equipment and
that predicted from the interval scales. The
correlation coefficient was found to be 257
(p < .01). Thus, from both these analyses, it
is possible to conclude that the interval scales
derived from scaling methods using photographic
stimuli were a reasonable, though far
from perfect, predictor of actual use of play
equipment by eight- to nine-year-old children.
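The goodness-of-fit comparison between observed and expected proportions of use can be sketched as follows. This is a generic Pearson chi-square statistic, offered as an assumption about the form of the test; the article does not say how expected proportions were derived from the scale values, so they are simply taken as given here, and the function name and counts are hypothetical.

```python
def chi_square_gof(observed_counts, expected_props):
    """Pearson chi-square goodness-of-fit statistic and its degrees of freedom.

    observed_counts: counts of children seen on each piece of equipment.
    expected_props: expected proportions of use (should sum to 1).
    """
    total = sum(observed_counts)
    expected = [total * p for p in expected_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed_counts, expected))
    df = len(observed_counts) - 1
    return stat, df

# Hypothetical counts for three pieces of equipment.
stat, df = chi_square_gof([30, 20, 50], [0.3, 0.3, 0.4])
```

The statistic would then be compared with a chi-square critical value for the given degrees of freedom; a small value indicates that observed use does not depart reliably from the proportions predicted by the preference scale.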
[...]tive behavior which may be conditional in
most playground situations. In these terms,
the preference scale may measure what the
children would do, necessary conditions being
present, while the actual choice behavior may
reflect what the children can do with the
equipment.

Another reason for the restriction in range
of preference scale may be the nature of the
scaling process. Both paired comparison and
rank order are based on the law of comparative
judgment. As has been noted, this
assumption is valid only under restricted
circumstances (Stevens, 1966). The alternative
was to use ratio scaling methods. However,
the conceptual difficulty of these techniques
for children of this age seemed insuperable
and they were discarded.

In sum, three conclusions may be drawn 
from this study. 


1. Predictions justified by the photo inter- 
views are not contradicted by any of the 
scales predicted from the behavioral obser- 
vations. 

2. For those preference relationships in 
which the two measurement approaches agree, 
it is clear that the photo interview method is 
less sensitive than the behavioral observation. 

3. Observed differences in the attractiveness
of the several pieces of play equipment are
quite small, partly because of the photos
and partly because of the scaling method.

Fig. 2. Comparison of scales. [Legend: 95% confidence interval; panel shown: Sample B (n = 452).]


Finally, the results of this research suggest
that scaling techniques can be used as part
of the design process. At one level, play
equipment with certain characteristics is
clearly preferred over others, for example,
swings versus high-low bars. Knowledge of
these preferences can direct the designer of
play equipment to synthesize new kinds of
equipment with reasonable assurance that
they will be acceptable to and used by the
children for whom they are designed. This study
actually tests a methodology for exploring
children's preferences. Using this class of
methodology to examine attributes of play
equipment and the abstract qualities of those
materials is essential for synthesizing designs.
This aspect of the problem is addressed in
another phase of the [...] al., 1972), [...]
attributes. Using [...]bolic stimuli, th[...]
is something that is rarely done in the
engineering process.
UPON PRODUCT PERCEPTIONS

Individuals differing in intolerance of ambiguity judged the newness
of products varying systematically in atypicality and reported their
willingness to buy them. The more intolerant person perceived the
atypical products as newer than did the tolerant one. Perceived
product newness tended to be positively related to willingness to buy
among tolerant individuals, but negatively related among intolerant
ones.

[...] individual's personality influence [...] of a product as new?
The answer to this question may help explain why many investigators
(e.g., Koponen, 1960; Robertson & Myers, 1969) have reported weak or
inconsistent results in analyses of the relationship of personality
to new product acceptance. Since many of these studies did [...] the
new products were actually seen as new by individuals with given
characteristics, these results may have been a function of this
uncontrolled variation. Also, the answer would help clarify the
dynamics relating personality and innovation proneness (cf. Jacoby,
1971).

The present study asked whether the perception of unusual products as
new could be a function of a specified personality characteristic,
intolerance of ambiguity, and be related to acceptance of those
products.

A product may be objectively defined as new because (a) it is
incompatible with cultural behavior patterns (Barnett, 1964), (b) it
has been recently introduced to the market (e.g., Jacoby, 1971), or
(c) for other reasons (cf. Rogers & Stanfield, 1967). Factors which
make a product appear subjectively newer may be those making it more
unfamiliar, that is, those which reduce one’s ability to estimate
product attributes and the consequences of its [...]



Perceived newness of products may also de- 
pend on the extent to which a person charac- 
teristically finds ambiguous, unfamiliar stim- 
uli aversive, that is, on his intolerance of 
ambiguity (Budner, 1962). The intolerant 
individual's greater aversion to ambiguity may 
lead him to avoid unusual products generally 
and, hence, to have less experience with them. 
The greater unfamiliarity of such products 
would lead him to judge them to be newer 
than would individuals who are more tolerant 
of ambiguity. 

Finally, because the person intolerant of 
ambiguity finds the unfamiliarity of atypical 
products more aversive, he should be less will- 
ing than the more tolerant person to buy 
them. 

METHOD 
Subjects 

Thirty-two male students attending an undergraduate
psychology class at Purdue University
served as subjects. These subjects, selected on the
basis of their classification as high or low in intolerance
of ambiguity, had scored in the first and
fourth quartiles of the students in that course (N = …)
on Budner's (1962) measure of intolerance of
ambiguity.


Instruments 


… of product advertisements,
a … of two-sentence product descriptions
was constructed which represented many product
… levels and contained such descriptions
as "A home wine-making kit produces up to
… bottles of wine per batch. The home wine-maker
requires little attention and is odor free." Each description
portrayed three functional attributes of the
product, mentioned the product class twice, and
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TABLE 1 
DESIGN OF PRODUCT (P) SAMPLES

                    Atypicality level
Version    High        Medium      Low

Sample A
   1       P1 (HM)     P3 (HM)     P5 (HM)
           P2 (LM)     P4 (LM)     P6 (LM)
   2       P1 (LM)     P3 (LM)     P5 (LM)
           P2 (HM)     P4 (HM)     P6 (HM)

Sample B
   1       P7 (HM)     P9 (HM)     P11 (HM)
           P8 (LM)     P10 (LM)    P12 (LM)
   2       P7 (LM)     P9 (LM)     P11 (LM)
           P8 (HM)     P10 (HM)    P12 (HM)

Note. Abbreviations: HM = highly prestigious manufacturer,
LM = less prestigious manufacturer.


contained no affectively toned words nor statements


P : € products w €re cloth 
ome er oat ees itr 

an sh sampl ere Were two Products high 
ae eis (X x29), two mediim (oae y = 
d e ne T 3.8). Within each sample 
Dz bs! Not differ by £ test (p> 
differ a 


m 10) ed atypicality, but did 

Other two levels, Ad 0) from Products in the 
$ 12 elected Products ywi 

moderately useful (200 = were rated 

To control a potential source of experimental bias
in which subjects' judgments of a product might
vary with their estimates of the manufacturer's prestige
(e.g., Bauer, 196…), each product was combined
with the name of a manufacturer.
Corporations were combj seenufactur 
products; two Were high (eg, RU 
two were low (eg. Gordon Tal Moto 
Eauged by a Separate pilo 
versions of the Product 
first version of a sample, o n 
each level of atypicality was pn; D Products 
manufacturer and the other 


one at each level of atypicality, 
Four product booklets were then constructed. Each 


tors, 


Administration 


Subjects participated in two allegedly separate
studies: the first, supposedly to standardize a self-description
questionnaire; the second, to pretest student
reactions to various consumer goods for a future
experiment.


In the first "study," all subjects were given the 
questionnaire, Which included 
i i ' Scale and distractor 

X, about political 
y,” each subject was 
roduct tests and 
5. The latter in- 


… atypicality scale, (b) a 7-point graphic scale of willingness
to buy a product compared to unspecified other
products in that class which was anchored at very
unwilling (1) and very willing (7), and (c) a 7-point
graphic scale of product newness which was anchored
at very old (1) and very new (7). Subjects were
instructed to read all product descriptions before
rating one and to rate all products on a given scale
before proceeding to the next scale.


RESULTS 


Ratings of product atypicality, newness,
and willingness to buy were analyzed separately
for Product Samples A and B by a 2
(intolerance) × 2 (source prestige) × 3
(atypicality) analysis of variance.
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The main effects 
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Ei 
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= Among Experiment 
… ratings of products representing a specific level
of atypicality in Sample A were frequently
… the mean atypicality ratings of products
representing the corresponding level of atypicality
in Sample B; hence, main analyses were …
separately for each product sample.
p < .01). In Sample A all three levels were
different from each other at the .01 level. In
Sample B high-atypicality products were
judged more atypical than products medium
(p < .01) and low (p < .01) in atypicality,
but the latter two did not differ significantly.
In neither sample were intolerance or source
prestige effects significant.


Newness

Sample A. The main effects of atypicality
(F = …, df = 2/60, p < .01) and of intolerance
(F = …, df = 1/30, p < .01) were significant, as
was the Atypicality × Intolerance interaction
(F = 4.33, df = 2/60, p < …; Figure 1). Intolerant subjects
not only judged the low-atypicality products as
newer (X̄ = 3.63) than did the tolerant subjects
(X̄ = 2.41, p < .01), but also judged
the medium-atypicality items newer (X̄ =
3.25 vs. X̄ = 4.03, respectively, p < .01). The
intolerant (X̄ = 6.28) and tolerant subjects
(X̄ = 6.12) did not differ in rated newness of
the high-atypicality goods. Further, among
both subject groups, the high-atypicality
goods were judged newer than the medium (ps
< .01) and the low ones (ps < .01), and the
medium newer than the low (ps < .01).
Source prestige was significant neither as a
main nor an interaction effect.

[Figure 1. … newness as a function of intolerance of ambiguity
and product atypicality.]

Sample B. The main effects of atypicality
(F = 8.48, df = 2/60, p < .01) and of intolerance
(F = 7.41, df = 1/30, p < .05) were
again significant, whereas the Atypicality ×
Intolerance interaction was not (F = 1.3…).
Consistent with the analysis of Sample A,
intolerant subjects (X̄ = 5.02) judged all products
as newer than did tolerant subjects (X̄ =
4.10). Further, both subject groups evaluated
the high-atypicality items (X̄ = 5.15) as
newer than the medium (X̄ = 4.41, p < .01)
and the low ones (X̄ = 4.14, p < .01); the
latter two did not differ significantly. Source
prestige effects were not significant.
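
The degrees of freedom reported for the newness analyses (1/30 for intolerance, 2/60 for atypicality and for the interaction) can be checked with a short sketch. It assumes a conventional mixed design in which the 32 subjects are nested in the two intolerance groups and atypicality is a repeated factor; the function name is illustrative and source prestige is left out for brevity.

```python
def mixed_design_df(n_subjects, n_groups, n_within_levels):
    """Degrees of freedom for one between-groups and one repeated factor.

    Assumes subjects are nested in groups and every subject rates every
    level of the repeated factor (an assumption about the design).
    """
    between_effect = n_groups - 1
    between_error = n_subjects - n_groups
    within_effect = n_within_levels - 1
    within_error = within_effect * between_error
    return {
        "group": (between_effect, between_error),
        "repeated": (within_effect, within_error),
        "interaction": (between_effect * within_effect, within_error),
    }


# 32 subjects, 2 intolerance groups, 3 atypicality levels
df = mixed_design_df(n_subjects=32, n_groups=2, n_within_levels=3)
print(df)
```

Under these assumptions the sketch reproduces the 1/30 and 2/60 values that accompany the F ratios above.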


Willingness to Buy 


The main effect of atypicality was significant
with both Samples A (F = 5.42, df =
2/60, p < .01) and B (F = 29.50, df = 2/60,
subjects were more 
atypicality products 
dium (X = 3.70, p< 
) X-2327,5- 01) ones. 
Similarly, in Sample B Subjects were more 
willing to buy the low (X = 5.25) than the 


5) or high (x — 


to buy were 
for Sample A 


B and — 2$ 


; less willing were intol- 

Bets mds bp it. Among tolerant sub. 
5 r han Y i 

[9 es) Wiss d, Correlations were .47 


aple B and .13 "5) wi 
Sample A Products; the Newer Eo 
? 


the more Willing Were r 
5 tolerant Subjects to buy 


atypical Products Were 
y all Subjects. those more 
guity Senerally ju 

atypical products as newer 


tolerant subjects. Further th 
à d e new, 
erant subject perceived the ^ US tol. 


than 


more willing he was to buy it, whereas the 
newer the intolerant subject judged a product 
to be, the less willing he was to purchase it. 
Since there was no difference between the two
groups in perceived atypicality, the differences
in perceived newness and in acceptance
could not be explained by that factor. The
above results are consistent with the following 
assumptions (Budner, 1962): (a) The greater 
aversion of the intolerant person to ambiguity 
leads him to have less contact with atypical 
products and, hence, to see them as newer 
than does the more tolerant individual. (b) 
Due to his aversion to the greater unfamili- 
arity of the newer products, the intolerant 
person finds newer products less desirable than 
older ones; the tolerant person, on the other 
hand, finds newer products more desirable 
than older ones due to their ambiguity being 
more challenging or otherwise rewarding. 
However, since the tolerant and intolerant 
subjects were equally less willing to buy the 
more atypical products, extension of the basic 
assumptions of the study is necessary, Per- 
haps, due to greater famili 
products, product perceptic 
sons reflect more dimensio; 
more intolerant individu 
cality may be negatively 
other dimensions m 
(cf. Blake, Perloff, 
Summers, 1967). In 


Positively re- 


lated to acceptance, the to erant individual 


al products as newer 
d, second, to be less 


willing to p eptually newer prod- 


uy the perc 
ucts, 
studies (cf. Jacoby, 1971; Robertson & Myers,
1969) may have been due to differences
among the personality groups in the perceived 
newness of the products. The present finding 
that willingness to buy was related in differ- 
ent ways to product atypicality and to per- 
ceived newness suggests the importance of 
studying the dimensionality of perceived new- 
ness of products (cf. King & Summers, 1967). 
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FEATURE SECTION: STUDENT EVALUATION OF FACULTY 


STUDENT AND DEPARTMENT CHAIRMAN VIEWS OF THE 
PERFORMANCE OF UNIVERSITY PROFESSORS 


DANIEL N. BRAUNSTEIN 2 


Oakland University 


GEORGE J. BENSTON 


University of Rochester 


Evaluations by students for most university courses taught over a two-year
period by 347 professors were compared to rankings made by department
chairmen of their faculty. The faculty were ranked on the basis of professional
visibility, current research, teaching impact, communication ability, and departmental
contributions. Sixteen of 27 rhos computed for visibility and student
evaluations of teaching were negative. A substantial number of relationships
for research were around zero. Relationships for teaching and communications
were moderately positive. One-year stability coefficients of rankings by chairmen
were high for a single chairman, but considerably lower when a change
of chairman took place. In a chairman's view, research and visibility are highly
related, but effective teaching is only moderately related to these performance
criteria.


Few people would argue against the proposition
that universities ought to encourage
good teaching and that such encouragement
would ultimately yield valuable returns to the
institution. However, university administrators
typically have no organized way of
obtaining information about the quality of
performance of their faculty. According to a
survey of the American Council
on Education (Astin & Lee, 1967), the
most frequent sources used by department
chairmen to evaluate teaching are anecdotal
reports and a review of scholarly research and
publications.


3 5 (b 
Onships between the M 
1 This research is supported by a grant from the
ESSO Education Foundation.
The authors acknowledge … Woodford in …
problems involved in this …
2 Requests for reprints should be sent to Daniel
N. Braunstein, School of Economics and Management,
Oakland University, … Michigan 48063.


sets of evaluations. This study consists of a
detailed examination of different indices of
these reliabilities and of the observed relationships
between them.

Probably because of difficulties in obtaining
the cooperation of university faculties and
administrators, there are few published studies
measuring the processes used by administrators
to evaluate teaching and their relation
to student views. Although a number of
reliability analyses of ratings by students (see
a recent review by Costin, Greenough, &
Williams, 1971) show substantial test-retest
coefficients over time as well as substantial
internal consistency coefficients (.70s and
.80s), the authors are not aware of previously
published data for ratings by department
chairmen. Hayes (1971) found that department
heads associated good teaching with
research ability, but that student evaluations
of teaching were not related to that research
ability measure. Bressler (1968), using science
departments only, did find higher teaching
ratings from students for faculty who held
research grants; however, his statistics were
criticized by Quereshi (1968). Because only a
few studies of the relations of research activity
to student teaching evaluations disclose
small positive relationships between the two,
Costin et al. (1971) summarize the findings
as, at best, weak.

In the present study, the
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investigators designed a faculty-ranking pro- 
cedure that department chairmen used twice, 
during successive summers, for five faculty 
performance criteria that included both teach- 
ing and research. Student and alumni eval- 
uations of faculty teaching were collected for 
four semesters during the same period. The 
objectives were (a) to evaluate the consist- 
ency of student evaluation over pairs of all 
courses compared to repetitions of the same 
course; (b) to compare the rankings of the 
same faculty made one year apart by the 
same department chairmen, as well as those 
made by a sample in which the department 
chairman changed; and (c) to relate the two 
sources of faculty evaluation. 


METHOD 


Student Evaluations 


A standard questionnaire (student opinion questionnaire)
was used in about three quarters of the
courses at the University of Rochester Colleges of
Arts and Sciences, Engineering, and Management for
four successive semesters. This questionnaire contains
items describing aspects of the instructor's
behavior, such as availability to students, use of class
time, clarity of assignments, stimulation of interests,
etc., as well as two items covering an overall evaluation
of the course and an overall evaluation of
the instructor. The items are constructed with a
5-point scale.

A check was made of the correlations between the
overall evaluation items and the other items in the
questionnaire, since it was assumed that for the
purpose of the study only the overall evaluations
would be used. Product-moment correlations, using
class median ratings, ranged from .84 to .88, with
a median correlation of .65 (n = 193). The correlation
for the two overall evaluations was …

The Faculty Senate Committee on the Improvement
of Teaching urged that faculty distribute the
student opinion questionnaire in their classes. The
student opinion questionnaires were processed by the
committee, and the data were made available to the
researchers (one of whom was chairman of the
committee) with the approval of the faculty senate.
Prior to the distribution of the questionnaires, a
letter to each faculty member was sent requesting
that his students be allowed to participate in this
project. The anonymous questionnaires were filled
out in class during the last weeks of the semester,
placed in a sealed envelope, and returned by a student
to the university student counseling service.
Both graduate students and undergraduates participated
in approximate proportion to their enrollment
in the university (1:3). Over the four semesters,
a total of 713 individual courses were evaluated.


Departmental Chairmen Evaluations 


The administrators' data were collected from
departmental chairmen and from the Dean of the 
Graduate School of Management, which has no 
departments. The data included information regard- 
ing 347 professors in 18 of the 21 departments of 
the Colleges of Arts and Sciences, Engineering, and 
Management. These data were collected twice, dur- 
ing the summers of 1968 and 1969, through the use 
of an “alternation ranking” procedure designed to 
maximize the reliability of extreme ranks. 

Decks of cards on which the names of faculty 
were printed were provided to the chairman for his 
convenience in determining the ranking. In an alter- 
nation ranking, the judge ranks the highest member 
of the group first, then the lowest member, then 
chooses the next highest, second lowest, and so forth 
from the remainder of the card pack. 
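
The alternation ranking procedure described above can be sketched in a few lines. This is only an illustration of the order in which cards leave the deck: the function name, the faculty labels, and the numeric "judged merit" values are hypothetical stand-ins for the chairman's private judgment and are not taken from the study.

```python
def alternation_ranking(names, judged_merit):
    """Simulate alternation ranking of a deck of name cards.

    The judge alternately pulls the highest and then the lowest remaining
    card. Returns the order of picks and the final top-to-bottom ranking.
    `judged_merit` is a hypothetical scoring function for the judge.
    """
    remaining = list(names)
    picks, top, bottom = [], [], []
    choose_highest = True
    while remaining:
        chooser = max if choose_highest else min
        chosen = chooser(remaining, key=judged_merit)
        remaining.remove(chosen)
        picks.append(chosen)
        (top if choose_highest else bottom).append(chosen)
        choose_highest = not choose_highest
    # Cards picked from the bottom were chosen worst-first, so reverse them.
    return picks, top + bottom[::-1]


# Hypothetical deck of five faculty members
merit = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1}
picks, ranking = alternation_ranking(merit, merit.get)
print(picks)    # order in which cards were pulled
print(ranking)  # final ranking, best to worst
```

The extremes are decided first, while the judge still has the whole deck in view, which is the property the procedure relies on when it aims at reliable extreme ranks.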

The chairmen were allowed to eliminate from the 
deck any name for which they had no information 
on which to base a rank. However, they were en- 
couraged to rank a professor wherever possible. 
After the deck was complete, the chairmen then 
placed divider cards in it, separating the faculty into 
five categories: among the best, above average, 
average, below average, and considerably below 
average as compared to faculty in comparable 
departments. The bottom two categories were used 
sparingly and, therefore, were combined for the 
analysis. Faculty whom the chairmen could not 
rank were grouped separately. 

The chairmen sorted five duplicate decks of name 
cards in the above fashion. Each of the decks was 
sorted on the basis of a different criterion: overall 
teaching performance (which includes knowledge of 
the subject), ability to communicate, current re- 
search performance, professional visibility, and con- 
tribution to the department (or college). Each of 
these criteria was described by a short written 
definition, which was included along with the rank- 
ing instructions sent or given to the chairman. In 
order to minimize fears of public disclosure, which 
would contaminate results, all participants were 
assured that individual faculty data would be kept 
confidential by the investigators. 


RESULTS 
Evaluation by Students 


A coefficient of stability was computed for 
the student ratings in two ways: (a) from 
pairs of all courses taught by the same in- 
structor and (b) from pairs of the same 
course taught by the same instructor. Since 
at this university professors rarely teach the 
same course in adjoining semesters, the sec- 
ond computation is confounded by lags in 
time between the pairs. These product- 
moment correlations must be evaluated in the 
light of the fact that the distribution of the 
… suggesting the presence of a ceiling effect.

For 179 pairs of different courses taught … coefficients drop …
overall evaluation … .61 (p … course corre… be predicted, …
were included … semester apart, … both for … courses were paired
regardless of when … reliabilities are some… .68 (p < .01) for a
sample of … pairs of instructor evaluation and .61 (p < …) … of
course evaluation. When the sample was broken down by colleges,
some interesting … coefficients, based upon 45 pairs, are higher …
course ratings. Graduate management coefficients … lower, .57
(p < .01) and .42 (p < .05). For the engineering sample, the
repetition was too small to compute.

An alternative coefficient of stability … is to estimate … amount
of interrater agreement between members of any single class.
Samples constructed by randomly splitting classes in half and
correlating the ratings obtained on the separate
halves substantiate findings of Shock, Kelly,
and Remmers (1927). This early study found
that a class size over 25 tends to yield more
acceptable reliabilities. In our study, reliabilities
tended to increase at a decreasing rate
with increase in class size, with the coefficients
for class sizes above 25 being in the
.70s and .80s. For all class sizes, the median
coefficient was .68. For classes in the fields
of social science, management, and natural
science, such reliabilities were higher than
those in the humanities. Although computed
for half-class sizes, these coefficients have not
been estimated for full class sizes by the
Spearman-Brown prophecy formula.

Evaluation by Chairmen

Table 1 presents median correlations for
the departmental chairman evaluations of
faculty, department by department, computed
between their rankings from one year to the
next. Additionally, median correlations are
also given for the rankings obtained from
departments where a new chairman had been
designated during the second year.

TABLE 1
MEDIAN RANK-ORDER CORRELATIONS FOR
DEPARTMENT CHAIRMAN RANKINGS OF
THEIR FACULTY (1968 with 1969)

Criterion                         Same chairman   Different chairman
Professional visibility           .87             .49
Research performance              …               …
Teaching performance              …               .17
Ability to communicate            .75             .40
Contribution to the department    .43             .48

Notice that for the same chairmen, the reliabilities
are very high for visibility and
research and somewhat lower for teaching,
communication, and departmental contribution.
Three department chairmen were
highly consistent in their rankings of both
teaching and research during the period.
However, the obtained rank-order correlations
… changed … every case and were very small for teaching
performance.

TABLE 2
MEDIAN RANK-ORDER CORRELATIONS BETWEEN
CHAIRMAN RANKINGS

Criterion                          1968   1969
Visibility versus research         .70    .70
Teaching versus research           …      …
Teaching versus visibility         …      …
Visibility versus communication    .33    .24
Teaching versus communication      …      …

Note. … departments in which the chairman remained the same …

The relationships between the chairman
evaluations of professional visibility, research
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TABLE 3 
RELATIONSHIP OF CHAIRMAN EVALUATIONS TO STUDENT VIEWS
OF FACULTY TEACHING (RANK-ORDER CORRELATIONS)

                                          Chairman criterion
Area                      No. of
                          correlations   Visibility   Teaching   Research   Communications   Departmental
Humanities                7              −.09         .36        .24        .46              .03
Social science            7              −.21         .40        −.05       .46              .42
Natural science           6              .48          .30        −.04       .50              .26
Engineering               5              −.07         .49        .36        .60              .54
Management                2              −.42         .11        −.31       .37              −.13
No. negative rhos of 27                  16           0          14         1                8


performance, teaching performance overall, 
ability to communicate, and contribution to 
the department were analyzed for both years 
(see Table 2). The highest and most stable 
of the pairs of median correlations given in 
the table are for visibility versus research and 
teaching versus communication. When the 
teaching rankings are correlated with rank- 
ings of either research productivity or visibil- 
ity, the rhos drop. 

The five faculty evaluations made by departmental
chairmen were related to evaluations
of overall effectiveness made by students
for each faculty member. In order not
to assume any sophistication in measurement
beyond simple ordinal effects, an analysis
was made using Spearman rank-order correlation
coefficients. For each university department,
a computerized matching program
was developed which enabled the researchers
to identify and rank all those faculty for
whom chairman evaluations as well as student
course evaluation data were available. Ranking
of faculty according to the students was
compared to rankings for each of the five
criteria used by the chairmen. In order to
make the maximum use of available data,
rho was computed for each department, for
each of the two years in which a chairman
evaluation was available, using a mean of all
student evaluations of "overall effectiveness
of the instructor" as the basis for the student
ranks. Ranks based upon individual courses
and related to the chairman evaluation made
after the end of the course yielded many
fewer matches, hence, a smaller sample; however,
the results were similar to the overall
analysis.
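
The within-department computation described above can be illustrated with a minimal sketch of Spearman's rho. The five-person department, the chairman ranks, and the student means below are invented for illustration; the original matching program is not available, and the handling of ties by average ranks is an assumption.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward, giving ties their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank-order correlation: Pearson r computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical department: chairman ranks (1 = best) and mean student
# ratings of "overall effectiveness of the instructor" (higher = better).
chairman_rank = [1, 2, 3, 4, 5]
student_mean = [4.5, 3.9, 4.1, 3.0, 3.2]

# Negate the student means so that rank 1 means "best" on both sides.
rho = spearman_rho(chairman_rank, [-s for s in student_mean])
print(round(rho, 2))
```

One such rho per department and year, summarized by area medians, is the kind of figure that appears in Table 3.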

Certain conclusions can be drawn from the 
data, as shown in Table 3. Visibility and re- 
search performance seem to have little to do 
with effective teaching. Visibility, as evalu- 
ated by the chairmen, is often somewhat 
negatively related to the student view of the 
effectiveness of the instructor, although the 
obtained median rhos are not substantially 
different from zero. Sixteen of 27 rhos com- 
puted are negative. Similarly, many of the 
relationships between chairman evaluations of 
research and student evaluations of teaching 
are close to zero. Fourteen of these rhos are 
negative, but a substantial number of depart- 
ments from the humanities and engineering 
sections of the university yielded moderate 
positive relationships. However, much 
stronger positive relationships exist between 
the administrative and student views of teach- 
ing. In all subject areas, the chairman eval- 
uation of faculty communication skill is more 
closely related to student evaluation of teach- 
ing effectiveness than is the overall chairman 
teaching evaluation. In three areas of the uni- 
versity, departmental contributions are also 
positively related to student views of 
teaching. 
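
The counts of negative rhos quoted above (16 of 27 for visibility, 14 of 27 for research) can be set against chance with a simple two-sided sign test. This is an illustrative sketch, not an analysis reported by the authors, and it assumes the 27 department-by-year rhos are independent, which is doubtful because the same departments contribute in both years.

```python
from math import comb


def two_sided_sign_test(n_negative, n_total):
    """Two-sided binomial sign test against a 50/50 split of signs."""
    k = max(n_negative, n_total - n_negative)
    upper_tail = sum(comb(n_total, i) for i in range(k, n_total + 1)) / 2 ** n_total
    return min(1.0, 2 * upper_tail)


p_visibility = two_sided_sign_test(16, 27)  # 16 of 27 rhos negative
p_research = two_sided_sign_test(14, 27)    # 14 of 27 rhos negative
print(p_visibility, p_research)
```

Under that independence assumption neither split departs reliably from an even division of signs, which agrees with the statement that the median rhos are not substantially different from zero.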


Alumni Data


One further analysis of the stability of 
student evaluation was undertaken in this 
research. A sample of recent alumni (those 
who had graduated within the past five years 
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from both undergraduate and graduate pro- 
grams of three departments) was asked to 
evaluate the teaching of faculty in specific 
courses they had taken while at the university.
For the social science and humanities
departments, rhos of .85 and .84 were obtained
between ranks of faculty made by current stu- 
dents and those made by alumni. For a 
natural science department, this correlation 
was .43. Thus in two areas of the university, 
essentially similar ranking of faculty teaching 


was obtained from alumni and from current 
students. 


Discussion 


A key view held by many departmental
chairmen and some higher level administrators
suggests that since enough informal information
is available to them regarding differences
in teaching effectiveness of their
faculty, no formal method of obtaining
data is needed. This view has two important
implications: (a) It is sufficient to obtain on
a casual basis informal comments from interested
students. (b) Awareness of a faculty
member's research and of his expressed ability
to analyze his field is the most important aid
in evaluating the impact of his teaching. Often
added to this view is the suggestion that a
student's evaluation could change considerably
after finishing a … considered opinion rende…

The stability coefficients of student evaluations,
while substantially greater than …, especially in different
… It therefore seems naive to offer a … opportunity to give his
chairman a sample of his course evaluations
rather than submit … does not seem prudent, but the use of all
student ratings promises to yield information
that has substantial, but far from perfect,
stability.

Clearly, the departmental chairmen relate 
faculty research to visibility and teaching 
with effective communication skills. But their 
rankings of the teaching and research criteria 
are less similar. In the light of all these data, 
the popular use of research and publication 
as a method of evaluating university teaching 
is highly suspect. 

It is quite clear from these data that uni- 
versity departmental chairmen share only 
moderate agreement with students when they 
are specifically evaluating teaching. With two 
exceptions, evaluations of visibility and re- 
search show little relationship to student 
views of teaching and, indeed, may be 
slightly negatively related. Recall that in the 
eyes of the chairmen, teaching, especially 
communication skill, is not highly related to 
visibility and research. 

Of course, these data suffer from the limitations
of a single specific academic environment
as a collection point. They do corroborate
the results of a study recently published
from a similar research-oriented university
environment (Hayes, 1971). This article
concluded that department heads related
teaching and research ability, but that the
relationship between research ability and student
evaluation of teaching was very weak.
Since the study used absolute measures rather
than ranks and was not analyzed for reliability,
department by department, further
comparison cannot be made.

Why is research performance evaluated by 


administrators seemingly unrelated to student 
evaluation of teachin 


university, 

ceptionally 
indeed be f 
teachin 
(1971 
know 


Costin et al. 
eived subject 
t in students, 


graduate- ; 
Ei ate- undergraduate differences, as well 
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as the area differences perhaps indicated in 
Table 3. One possible clue as to the reason for
student-administrative differences in perception
of good teaching may be found in the
correlations with the chairman's communications
criterion. For all five medians, these are
higher than correlations with the chairman's
view of teaching. Perhaps an important component
of student evaluation of teaching is
communications ability, which may have little
to do with highly ranked research output.

Given these results, the authors must
seriously question any university policy which
evaluates teaching effectiveness (at least as
seen through the eyes of students, who are
the most immediate and tangible consumers
of the product) implicitly through the process
of evaluating research and visibility, unless of
course the university does not seriously wish
to evaluate teaching at all. The data also
suggest that while requiring an administrator
to make an explicitly separate evaluation of
teaching may have merit, information available
to him does not lead to rankings of
faculty which are substantially the same as
those obtained from students.
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LEADER BEHAVIOR DIMENSIONS RELATED TO 
STUDENTS' EVALUATION OF 
TEACHING EFFECTIVENESS' 


BAT-SHEVA LAHAT-MANDELBAUM AND DAVID KIPNIS


Temple University 2 


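Three hundred and seventy-eight students described the behavior of an instructor
using an adaptation of Fleishman's Supervisory Behavior Description
Questionnaire. In addition, students evaluated their instructors' ability
to teach. It was found that (a) instructor consideration was the main factor
related to student evaluations of their instructors; (b) graduate students
emphasized consideration less and initiating structure more than undergraduates;
(c) consideration interacted with initiating structure so that for
instructors high in consideration, high initiating structure did not influence the
evaluations, but for instructors low in consideration, high-initiating-structure
scores were associated with poor evaluations.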


In many colleges, students routinely evaluate
the teaching effectiveness of their instructors.
Once gathered, this information
may be published by students as an aid in
choosing courses and may be given weight by
personnel committees when deciding matters
such as tenure, promotion, and salary increases.
As a result of these uses, course evaluations
are not a matter of indifference to
faculty. Poorly rated instructors tend to
mutter about "refusing to pander to the
crowd," while the favored few speak lightly
about "letting the kids have their say."

Despite the widespread use of these evaluations,
there is remarkably little information
concerning what influences students to state
that one teacher is outstanding and that
another is poor. This study reports an initial
investigation of this matter.

Our starting … teachers' behavior … organizational
leadership behavior. This
assumption is based upon many factor
analytic studies that have consistently revealed
that two general dimensions account
for a wide range of teacher behavior (Coffman,
1954; Gibb, 1955; K…;
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S and initiating structure more than under- 
nteracted with initiating structure so that for 


n, high initiating structure did not influence the 
but for instructors low in consideration, 


Scores were associated with poor evaluations. 


high-initiating-structure 


Ryans, 1960). The first dimension relates to 
the teacher's personal relationship with stu- 
dents—his attention to emotional and social 
aspects of student's classroom life—and the 


second refers to an emphasis on the content 
of the course and the learning tasks. 


These two general dimensions seem related to the leadership dimensions of
consideration and initiating structure identified in factor analytic
studies of leader behavior within industrial and military settings (e.g.,
Fleishman, 1951, 1953; Halpin & Winer, 1952). If we accept the assumption
of similarity between teaching and other kinds of organizational
leadership, then some cues as to the relative importance students ascribe
to these two dimensions are available from an examination of industrial
studies concerned with how employees evaluate their foremen. Such studies
suggest that subordinates prefer supervisors high in consideration. Both
direct measures of job dissatisfaction and indirect measures of
dissatisfaction such as absenteeism, accidents, grievances, and turnover
have been found to be positively associated with high structure and low
consideration (Fleishman, Harris, & Burtt, 1955). Fleishman and Harris
(1962) have also found an interaction between consideration and
initiating structure: supervisors low in consideration had [high]
turnover and grievances regardless of the amount of structuring done.
However, high-consideration supervisors could increase structure without
increasing turnover and grievances.

The present study examines the relation 
between the two dimensions of consideration 
and initiating structure, as measured by the 
Supervisory Behavior Description Question- 
naire (Fleishman, 1953) and students’ eval- 
uations of teachers. Underlying the use of this 
questionnaire, which was developed in another 
organizational context, is the assumption that 
teacher-student relations are similar to the 
foremen-subordinate relations in that teach- 
ers, like foremen, must concern themselves 
with both the feelings of pupils and their 
level of effort. The main purpose of the study 
was to examine the relative weight given to 
each of these dimensions by students when 
evaluating the effectiveness of their instruc- 
tors and to see if the findings of Fleishman 
and Harris (1962) hold for the student- 
teacher relationship. 

A second purpose of the study was to 
examine whether there were differences in the 
relative weights given to consideration and 
initiating structure by beginning and ad- 
vanced students when judging the effective- 
ness of their instructors. 


METHOD 
Sample 


Three hundred and seventy-eight Temple University students, relatively
equally divided between freshmen-sophomores (n = 117), junior-seniors
(n = 154), and graduate students (n = 107), were asked to think about an
instructor they had during the present semester. They were asked to
evaluate the ability of the instructor they were thinking of and to
describe his behavior on a modified version of Fleishman's Supervisory
Behavior Description Questionnaire. The study was conducted two weeks
before the end of the semester to avoid the possibility that students
would evaluate teachers on the basis of the final grade they received.
Students were equally divided between those majoring in the social
sciences and natural sciences. Since initial analyses revealed no
differences between these two groups on the major dependent variables,
they were combined to form one sample.


Instruments 
Evaluation of instructors. The following two questions were used to
evaluate an instructor:

1. Taking everything into account, how would you evaluate the instructor?
2. Compare, in general, this instructor to your other college instructors:


TABLE 1

CORRELATIONS BETWEEN EVALUATIONS OF TEACHERS
AND DESCRIPTIONS OF TEACHERS' BEHAVIOR

Group                  N     Initiating structure   Consideration
Freshman-sophomore    117          -.34*                .79*
Junior-senior         154          -.05                 .79*
Graduate              107       [illegible]             .49*
Total                 378          -.10                 .72*

* p < .01.


The answers to the above questions were on a 5-point scale ranging from
“one of the best” (scored 5) to “one of the worst” (scored 1). Responses to
the two questions were summed to provide an over- 
all evaluation index. It is important to note that 
these two questions did not ask students whether 
they personally liked their instructor but rather 
focused on the ability of the instructor. 

Description of teachers’ behavior. Slight changes were made in the
Supervisory Behavior Description Questionnaire's items so that they
described the behavior of a college instructor rather than an industrial
supervisor (e.g., “Does the instructor criticize poor work?” and “Is the
instructor willing to make changes?”). In addition, to shorten the
questionnaire, 8 items were dropped from the consideration scale, leaving
20 items to measure consideration and 20 items to measure initiating
structure. Despite the dropping of the 8 consideration items, the revised
scales yielded satisfactory internal reliabilities (rs .80), and the
correlation between consideration scores and initiating structure scores
was .16, supporting the general finding that the two dimensions are
independent.


RESULTS 


Table 1 presents the product-moment correlations of teachers’ evaluations
with consideration and with initiating structure for all students and for
students classified by educational level. It can be seen that there was a
high correlation between teachers’ evaluations and consideration for all
students (r = .72, p < .01). Further, educational level moderated this
correlation, since the correlation between consideration and teachers’
evaluations was significantly lower for graduate students (r = .49) than
for freshmen-sophomores (r = .79) or for juniors-seniors (r = .79) as
measured by the z transformation (p < .01). It can also be seen that the
correlation between initiating structure and teachers’ evaluations was
significantly negative for freshmen-sophomores and was not significantly
different from zero for juniors-seniors or for graduate students.


TABLE 2

SUMMARY OF ANALYSIS OF VARIANCE FOR TEACHERS’
EVALUATION SCORES AS A FUNCTION OF YEAR IN
COLLEGE, CONSIDERATION, AND
INITIATING STRUCTURE

Source of variation                     df        MS         F
(A) Group year in college
    (freshman, junior, graduate)         2       3.40       <1
(B) Consideration                        1    1093.17    267.93*
(C) Initiating structure                 1       5.96      1.46
A X B                                    2      35.47      8.93*
A X C                                    2      28.92      7.08*
B X C                                    1      44.47     10.90*
A X B X C                                2       6.05      1.48
Within cell                            366       4.08

* p < .01.
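
The comparison of correlations "as measured by the z transformation" can be illustrated with a short sketch. It assumes the standard Fisher r-to-z test for two independent correlations and uses the rs and group sizes reported in the text and Table 1; the function name `fisher_z_difference` is introduced here for illustration only, and the authors do not state the exact computation they used.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """Compare two independent correlations via Fisher's r-to-z transformation.

    Returns the z statistic and a two-tailed p value from the normal curve.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)           # r-to-z transformation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))   # standard error of z1 - z2
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))            # two-tailed normal probability
    return z, p


# Consideration correlations from the text: undergraduates vs. graduate students.
z_fs, p_fs = fisher_z_difference(0.79, 117, 0.49, 107)  # freshmen-sophomores
z_js, p_js = fisher_z_difference(0.79, 154, 0.49, 107)  # juniors-seniors
print(f"freshmen-sophomores vs. graduates: z = {z_fs:.2f}, p = {p_fs:.5f}")
print(f"juniors-seniors vs. graduates:     z = {z_js:.2f}, p = {p_js:.5f}")
```

Under these assumptions both differences fall beyond the .01 level, in line with the significance statement in the text.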

As was previously mentioned, Fleishman and Harris (1962) reported that
consideration and initiating structure interacted in predicting
organizational outcomes. To test this possibility, [...] significant
interactions. The first interaction was
between educational level and consideration. 
Table 3 shows that instructors low in con- 
sideration were evaluated more negatively by 
undergraduates than graduate students. The 
differences between undergraduate and grad- 
uate students were significant, based on post 
hoc tests (p < .05). Further, the post hoc 
tests revealed no differences in the evaluations 
of instructors high in consideration by the 
three groups. All groups evaluated high-con- 
sideration instructors equally favorably. 

The second significant interaction was between initiating structure and
educational level. Two points are of interest here. First, graduate
students evaluated instructors with high initiating-structure scores more
favorably than freshmen-sophomores (p < .01, post hoc test). Second,
freshmen-sophomores gave significantly more favorable evaluations to
instructors low in initiating structure than to instructors high in this
dimension (p < .01, post hoc test).

The third interaction was between consideration and initiating structure.
It can be seen in Table 3 that when the instructor was high in
consideration, he was evaluated favorably regardless of his level of
initiating structure, but when the instructor was low in consideration,
he was evaluated significantly more favorably if he was low rather than
high in initiating structure (p < .05, post hoc test). In short, the
combination of low consideration and high initiating structure was the
least favored by the students, particularly among freshmen-sophomores.


DISCUSSION 

The findings revealed that the teacher seen as high in consideration by
his students was considered to be the superior teacher. For the most
part, the perceived attempts by a teacher to initiate structure through
planning, setting goals, etc., were either given no weight or a negative
weight in evaluating the instructor. The interaction between
consideration and initiating structure further indicated that high levels
of initiating structure were perceived negatively when the instructor was
low in consideration. The same high levels of initiating structure did
not negatively influence evaluations among high-consideration
instructors. These findings are consistent with those reported by
Fleishman and Harris (1962) and subsequently replicated by Cummins
(1971).

The interpretation provided by Fleishman 
and Harris (1962) for industrial supervisory— 
subordinate relations, as applied here, is 
that students under instructors who establish 
a climate of mutual trust, rapport, and toler- 
ance are more likely to accept higher levels of 
initiating structure. This might be because 
they perceive this initiating structure differ- 
ently from students in “low-consideration” 
climates. Thus, under low-consideration 
climates, high initiating structure is seen as 
threatening and restrictive, but under “high- 
consideration” climates, this same initiating 
structure is seen as supportive and helpful. 

The remarkable similarity between the above findings and those reported
in the industrial setting suggests a general pattern in the way in which
subordinates respond to superiors in hierarchical organizations. However,
previous studies concerned with subordinates’ evaluations of superiors
have not dealt directly with the status of the subordinate making the
evaluation. Of interest in the present study is the finding that graduate
students viewed high levels of initiating structure and low levels of
consideration more favorably than did freshmen-sophomores. We speculate
that graduate students have made a greater commitment than undergraduates
to educational goals and hence perceive teacher-task-related activities
more favorably.

A related point of interest was that the two dimensions of consideration
and initiating structure accounted for significantly less of the variance
of graduate students’ evaluations than was the case in undergraduates’
evaluations. A multiple correlation of consideration and initiating
structure with instructor evaluation was greater than .95 among
freshmen-sophomores, .80 among junior-seniors, and .50 among graduate
students. It is probable that other aspects of teacher behavior which are
not measured by the Supervisory Behavior Description Questionnaire were
influencing graduate students’ judgments. Such aspects might be research
ability, degree of professional involvement, and other as yet undefined
aspects of teacher behavior.
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assess the effects of college student evaluations 
[...]rs in a feedback condition received the results midway through the
semester, whereas pro[fessors ...] had all feedback withheld. Results
indicated [...]n performance between midterm evaluations and evaluations
collected at the end of the term in the feedback condition. Implications
of these results for utilization of student evaluations are discussed.


rating differences between feedback and con- 
trol groups. In both of the experiments, the 
teachers studied were not from a full-time 
college professor population, and the nature 
of the feedback given did not allow any nor- 
mative comparisons with other teachers. The 
possible motivating role of discrepancies be- 
tween a professor's subjective evaluation of 
his performance and data summarizing stu- 
dent views was also not utilized. 


Whether or not the provision of student feedback data chan[...] may be a
func[...] the effectiv[...] interest ma[...] normative [...] estimate
h[...] to observable teachin[...] [ela]psed during the se[...]
[beh]aviors that would be [...]
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allow interpretation of evaluations; and 
finally, to maximize involvement, expectancy 
data were requested from the faculty. 


METHOD 


Setting 

This study was undertaken at Oakland University, a state-supported school
in the Detroit metropolitan area with an enrollment of 7,000 students.
While the University is selective in admissions, a wide range of
backgrounds is represented in the student body even though a large
portion of the University’s enrollment (i.e., over 70%) is comprised of
commuting students. Faculty committees regularly examine data concerning
student evaluation of courses.


Subjects 
Faculty in the School of Economics and Management and in the Department
of Psychology agreed to participate in the study. The experiment
initially included 31 classes taught by 21 instructors (out of a total of
28 full-time faculty) with a total enrollment of 1,830 students. Classes
with an enrollment of less than 25 and those with little or no
student-instructor contact were excluded. The courses ranged from
freshman introductory sections to dual senior-graduate courses and from
smaller discussion groups to large lecture sections.


Experimental Manipulation 

Faculty members in the two departments were randomly assigned to an
experimental (feedback) group or a control (no-feedback) group. Control
for teaching experience was made by random assignment within
low-experienced (defined as under five years by self-report) as well as
high-experienced faculty pools to the experimental and control groups.

Prior to the initial administration of the questionnaire, faculty members
had been informed only that they would be requested to give up a portion
of class time at midsemester and at the end of the semester. No
description of the purpose of the study was made available to them.
Furthermore, only the student research assistant knew which faculty
member was in which feedback status group.

Actual administration of the questionnaire was conducted by student
assistants not enrolled in the class being surveyed at a date and time
specified by the instructor during a specified week. These assistants
were instructed to read a prepared introduction and not to answer
questions. Few difficulties in filling out the questionnaire were
encountered, since students had filled out similar forms in previous
semesters. The students were informed that instructors would not see the
actual completed forms. No identification of the student was required on
the questionnaire.

Responses from each class were immediately processed in order to provide
a frequency distribution and mean score for each item to the instructors
in the feedback group. An aggregate run of all responses was tabulated.
Within four class periods of the midsemester (i.e., eighth week)
sampling, both the individual instructor's own classes’ tabulation and
the aggregate tabulation were returned to the experimental group. The
semester was 15 weeks long, including a final examination week. Thus, 5
weeks remained prior to retesting for the feedback information to be
utilized.

Data were obtained on the expectations which 
each faculty member had regarding student judg- 
ments of his behavior. After the administration of 
the questionnaire but prior to returning any feedback, 
the experimental subjects were requested to fill out a 
questionnaire indicating where they expected the 
median to fall for each item. They were required to 
return these forms as a prerequisite to receiving 
their feedback. 

During the last week of classes (before final exams), student assistants
readministered the questionnaire. The tabulation procedure was identical
to that at midsemester. Two weeks after the end of final exams, all
faculty were given their summary results and the aggregate runs from both
midterm and the end of the semester.


Questionnaire 


The questionnaire utilized was similar in format 
to teaching evaluation instruments used in many 
other universities (e.g., see McKeachie, 1969). It 
was composed of 23 items relating to specific be- 
havioral characteristics of faculty members which 
might reasonably be subject to change during the 
course of a semester. Thus, personality items, which 
are less responsive to change, were excluded. In addi- 
tion to overall evaluations of the course and the 
instructor, item topics included giving consideration 
to student backgrounds, giving feedback, providing 
schedules of course events, stating expectations, 
showing enthusiasm, making clear assignments, stim- 
ulating thinking, and conveying information clearly. 
Response modes utilized 5-point Likert scales and 
multiple choices with various numbers of alternatives. The subjective
“best” answer was not always in the same relative position (i.e., first,
last) in order to control for response position bias.

The median comparison of number of responses 
on the final ratings as compared to the midterm 
was 89%. This indicates a considerable overlap in 
student raters between the two administrations of 
the questionnaire. The median product-moment cor- 
relation of responses between administrations was 
.85 for the control group and .88 for the experimental 
group. This may indicate substantial reliability for 
the questionnaire. 


RESULTS 


Two professors and three classes were dropped from the study due to
administrative problems. The data are based on the remainder: 27 classes
and 19 different professors. There were 15 classes (10 different
professors) in the experimental group that received feedback from the
midterm evaluation and 12 classes (9 different professors) in the control
group.


TABLE 1
CHANGES IN MEDIAN STUDENT RATINGS BY GROUP

Change              Experimental    Control       Total
Total                    63            54          117
No. positive (%)      41 (65%)      19 (35%)     60 (51%)
No. negative (%)      22 (35%)      35 (65%)     57 (49%)

In analyzing the data, changes in ratings 
for experimental and control groups between 
administrations, as well as comparisons on the 
final ratings, were made. However, simple 
comparisons between groups on the final 
assessment (as suggested by Cronbach and 
Furby, 1970) were not considered appropriate 
because, despite random assignment, the 
groups were not equal on many midterm 
ratings. Median tests showed higher ratings 
for the control group on a majority of the 
midterm items, therefore, making it difficult 
to interpret simple comparisons on the final. 
Hence, changes were analyzed. 

The basic effects of the feedback are shown in Table 1, which represents
the total number of positive, negative, and total changes for the
experimental and control groups. These data are based on all
questionnaires in all the classes. A change is defined as a shift in the
median for an individual item of at least one scale point in either a
positive or negative direction between the midterm and final evaluations.
[...] or less, this [...] criterion. Table 1 shows that the feedback
produced a strong increase in positive changes in the experimental group,
whereas the control group showed a strong increase in negative changes
from the midterm to the final evaluation. In the experimental group, 41
(65%) of the total number of changes in median scores for each item in
each class were positive and 22 (35%) were negative. For the control
group, these percentages were reversed. Differences in the number of
positive and negative changes were analyzed by group, using a 2 X 2
chi-square test. The result was highly significant (χ² = 10.43, df = 1,
p < .01), using Yates' correction formula. Clearly, the feedback
condition produced a set of strong positive shifts in evaluations
compared to the nonfeedback control group.


TABLE 2
CHANGES IN MEDIAN STUDENT RATINGS [...]
OF FACULTY FOR FEEDBACK GROUP

                Estimate by faculty
Change          Over      Under      Total
Positive         13         4          17
Negative          0        14          14
Total            13        18          31
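
The 2 X 2 chi-square comparison of positive and negative changes by group can be sketched as follows. This is a minimal illustration, not the authors' computation: the function name `chi_square_2x2` is hypothetical, the experimental counts (41 positive, 22 negative) come from the text, and the control counts (19 positive, 35 negative) are taken as an assumption from Table 1 and the statement that the percentages were reversed. The sketch returns the statistic both with and without the Yates continuity correction; the value obtained may differ somewhat from the reported 10.43 depending on rounding and on which form is applied.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square statistic (df = 1) for the 2 x 2 table [[a, b], [c, d]].

    With yates=True, the continuity correction subtracts N/2 from
    |ad - bc| before squaring.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2.0, 0.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: experimental, control; columns: positive, negative changes.
chi2_plain = chi_square_2x2(41, 22, 19, 35)
chi2_yates = chi_square_2x2(41, 22, 19, 35, yates=True)
print(f"uncorrected chi-square:     {chi2_plain:.2f}")
print(f"Yates-corrected chi-square: {chi2_yates:.2f}")
```

Either form exceeds 6.63, the critical value for p < .01 with one degree of freedom, so the conclusion stated in the text does not hinge on the correction.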

In addition to the evaluation data collected 
from the classes, data were also collected from 
the participating professors on their expected 
evaluations for each question for each class. 
This was done in an attempt to determine 
where the maximal discrepancies occurred 
between perceived and actual performance 
and to explore the effect of these discrepancies 
on subsequent evaluations at the end of the 
term. All 15 of the subjects in the experimen- 
tal group cooperated, and their data are pre- 
sented in Table 2 in the form of a frequency 
distribution of the cases where the expectancy 
differed from the actual midterm evaluation 
by at least one scale value for a specific ques- 
tion and where there was also a change 
between the midterm and final class evalua- 
tions for that question of at least one scale 
value. (A similar analysis was planned for 
the control, but failure to comply reduced 
that sample size to only four professors.) 

There were 31 cases that qualified. The most striking aspect of Table 2
is that all of the overestimates also show positive shifts from the
midterm to the final evaluation and 14 of the 18 underestimates show
negative shifts. This indicates that where an expectancy is in error,
there is a good chance that there will be a subsequent shift in the
evaluation in the direction of the expectancy for that trait. This
occurred in 27 of the 31 cases (as well as in 5 out of 6 of the classes
in the control group). This trend was examined by computing a phi
coefficient which yielded a value of .77 (χ² = 23.87, df = 1, p < .01).
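
The phi coefficient for the expectancy data can be reproduced from the cell counts given in the text (13 overestimates, all followed by positive shifts; 18 underestimates, 14 followed by negative shifts and therefore 4 by positive shifts). The sketch below is illustrative only; the function name `phi_coefficient` and the cell layout are assumptions made here.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi coefficient for the 2 x 2 table [[a, b], [c, d]]."""
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))


# Rows: positive change, negative change; columns: overestimate, underestimate.
phi = phi_coefficient(13, 4, 0, 14)
print(f"phi = {phi:.2f}")
```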


Discussion 


In the study, feedback appears to have had 
an effect on improving the nature of student 
ratings given a professor. Since the data used 
to substantiate this conclusion are in the form 
of shifts in the medians obtained for each 
class of one scale point or more, the results 
must be considered in this light. A chi-square 
analysis of median shifts makes use of more 
conservative assumptions regarding scale 
characteristics than would an analysis of 
variance involving means. Ordinal properties 
of the scales are used, and shifts in extreme 
values without shifts in central tendency have 
no effect. The number of positive shifts in the 
experimental group (65%) indicates a prob- 
ability that many students near the center of 
the individual class distributions shifted their 
evaluations one or more scale points toward 
more favorable scale anchors. Comparison of 
scale medians obtained for the experimental 
and control groups for the initial evaluation 
showed no differences. 

The precise nature of the feedback condi- 
tion should be considered in evaluating these 
results. First of all, a professor was able to 
compare his own performance with that of his 
peers. He also was requested to form expecta- 
tions regarding the nature of the feedback 
data before it was presented to him. If eval- 
uations by others are to be an important part 
of the process of modification of teaching be- 
havior, as Daw and Gage (1967) hypothesize, 
then normative and expectancy data should 
enhance the effect. Unfortunately, this exper- 
iment was not designed to study the relative 
contribution of each of these parts of the 
feedback condition. This is a subject for 
future research. 

In comparison to the Miller (1971) study where ratings were obtained
after only four weeks, the first student evaluations in this experiment
were made from seven to eight weeks after the beginning of the semester.
Therefore, in the present study, students were able to obtain more
reliable information on instructor behavior, especially with respect to
examinations and grading. This information may have led to more reliable
and valid baseline evaluations and could be one explanation for Miller's
negative results.

Of course, waiting until later in the semes- 
ter for ratings is costly in terms of time left 
for feedback and change in behavior. This 
raises the question of what kind of instructor 
behavior could be expected to change in a 
period as short as five or six weeks. The pres- 
ent study indicated that a number of observ- 
able, noncomplicated instructor activities 
could be changed in a short period of time so 
as to affect student ratings and ultimately, 
perhaps, learning. In addition to items on the 
overall evaluation of the course, specific items 
which contributed to the experimental group 
change included being open to suggestion on 
tests, discussion, and term paper topics; stat- 
ing what is expected from students; relating 
assignments to course purposes; conveying 
knowledge; explaining course objectives; and 
stimulating thinking. Discussing grading 
criteria or scheduling assignments further in 
advance are activities which require no fun- 
damental personality change. Obviously, it is 
unlikely that student feedback can be 
expected to change a professor's sense of 
humor. 

In summary, the boundary conditions under which feedback produced change
in this experiment included convenient packaging of feedback data for
peer comparison, using observable instructor behavior items, and
scheduling the evaluation only after sufficient time had elapsed for the
student observers to gather reliable evidence.

One critical difference between this study 
and both the Miller and the Tuckman and 
Oliver experiments lies in the nature of the 
subjects. This study employed full-time fac- 
ulty of all professorial levels, while the earlier 
ones used either graduate assistants or high 
school teachers, respectively. There may be 
substantial differences in the motives of col- 
lege professors compared to the others. 
Although Tuckman and Oliver obtained posi- 
tive results, findings with a sample of high 
school instructors and student evaluators are 
difficult to generalize to college populations. 

The strength of the relationship between discrepant expectation and
change in ratings was somewhat surprising (φ = .77), although McKeachie
and Pambookian (personal communication, May, 1973) reported similar
findings. A social comparison, or a dissonance model, might be used to
explain the result (e.g., see Secord & Backman, 1961). When an individual
is exposed to a view of his behavior which is different from his
self-concept, he has a choice of modifying the self-concept or changing
his behavior (thus, ultimately modifying the views of others). A third
choice might be to minimize the value of the whole procedure. The
obtained phi correlation coefficient shows that where both dissonance and
behavioral change took place, the change was in the direction designed to
produce equilibrium. Note that the data do not include nonchangers;
hence, changes in self-concept were not tested. The strength of the
relationship indicates the possible predictability of the direction of
[...] courses. Where di[...] occur, these may be due to premat[...]
speculations.

The use of this model leads to a number of research hypotheses which this
experiment was not designed to study. For example, compared to the
changers, the nonchangers may have negated the value of student feedback
even before they received it. Is such an attitude subject to modification
after presentation of student ratings? Under what conditions does a
professor’s self-concept of his performance as a teacher become modified?
In general, the literature discloses relatively few experimental studies
of professorial teaching behavior. The organizational demands, training,
professional roles, and clients of college faculty are sufficiently
different from those of high school and grade school instructors so that
an experimental literature involving college faculty needs to be
developed. In light of a current questioning of previously sacrosanct
traditions in higher education, the present authors are hopeful that
faculty and administrators will cooperate in this endeavor.
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ANIMADVERSION ERROR 


IN STUDENT EVALUATIONS 
OF FACULTY TEACHING EFFECTIVENESS 


Midterm grades of students were compared to the grade the student gave
the professor on his teaching effectiveness. A one-way analysis of
variance showed better than chance correspondence (η² = .23). This
tendency was entitled the animadversion error, and its importance in
subordinate-supervisor ratings was discussed.


Ratings, until the fairly recent past, have been almost exclusively those
of supervisors rating subordinates or job applicants. The academic
setting is one of the few organizational environments that tolerates, and
even encourages, subordinate ratings of the supervisor’s performance
(Gustad, 1961; Kent, 1966; Labovitz & Hagedorn, 1971; Sharon, 1970;
Weaver, 1960). There is a need to know more about factors influencing
such faculty ratings.

The present study examines the relation between the student’s course
grade and his rating of his instructor.


METHOD 


Subjects 

Business students enrolled in a course in industrial relations taught by
the senior author were used as raters (N = 86) during the winter quarter
1972. The course is a core requirement for a bachelor's degree in
business administration.


Procedure 


After the midterm examination was graded and passed back to the students
for their review, it was traditional for the students to provide the
department chairman with evaluations of the professor's effectiveness in
teaching the course. A simple A-F rating scheme was used, and each
student was provided a 3 X 5 inch index card on which he was instructed
to indicate his evaluation of the teacher's effectiveness and the grade
he received on the midterm examination. The professor was out of the room
as the cards were turned in. The cards, which were stacked by a student
and given to the professor, were then sorted by the student's midterm
grade, and the distribution of grades given to the professor was then
plotted for each grade.


RESULTS AND DISCUSSION 


An analysis of variance was conducted with N = 85. (The F midterm
category had an N of 1 and was eliminated from the analysis of variance.
The F student did give the professor an F also!) An F (df = 3/81) of 8.13
was significant beyond the .001 level (η² = .23).
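
The reported proportion of variance can be recovered from the F ratio and its degrees of freedom. The sketch below assumes the usual one-way analysis of variance identity, η² = (F × df between) / (F × df between + df within); the function name `eta_squared_from_f` is introduced here for illustration and is not taken from the article.

```python
def eta_squared_from_f(f, df_between, df_within):
    """Eta squared for a one-way ANOVA, recovered from F and its degrees of freedom."""
    return (f * df_between) / (f * df_between + df_within)


# F(3, 81) = 8.13, as reported for the midterm-grade groups.
eta_sq = eta_squared_from_f(8.13, 3, 81)
print(f"eta squared = {eta_sq:.2f}")
```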

Clearly, a small but significant portion of 
the variance in the student's ratings of faculty 
teaching effectiveness is a reflection of the 
student’s midterm grade. A suitable term for 
this source of bias in rating should imply the 
mirroring back to the supervisor of his evalu- 
ation of the subordinate’s performance. Web- 
ster’s Seventh New Collegiate Dictionary 
(1967) was consulted under terms with the 
connotation of reflecting blame. Animadver- 
sion was defined as a term implying criticism 
prompted by prejudice or ill will, hence, the 
adoption of the term animadversion to 
describe the error. 

Allowing subordinates to rate their super- 
visors is a procedure that has yet to be 
adopted widely in private industry or govern- 
mental settings. Nevertheless, in settings 
where such ratings may be gathered, the 
potential impact of the animadversion error
should be realized.

It is not the intent to discredit student 
evaluations of faculty, but rather to put them 
into a more meaningful perspective. One pos-
sible administrative ploy would be to require
faculty ratings before any examinations are 
given, thus depriving the rater of the con- 
tamination information, that is, his knowledge 
of his grade to date in the course. 

Subsequent study should evaluate this error 
at different stages of the course. One would 
A procedure and rationale for evaluating college teaching using behaviorally
anchored rating scales was presented. In Stage 1 (n = 38 students), nine
independent dimensions important for teaching evaluation and representative
behavioral incidents were identified. In Stage 2 (n = 54 students), incidents
were allocated to dimensions. In Stage 3 (n = 139 students), incidents were
evaluated on a scale representing effective teaching. Items with low standard
deviations were retained for the final scales. The underlying notions of the
resulting scales and the advantages of using the behavioral expectation pro-
cedure relative to other procedures are discussed.


An increasing number of articles are being 
devoted to the analysis and evaluation of col- 
lege teaching. Generally, two questions have 
been posed: (a) Is it possible to identify 
independent dimensions of “teaching ability?” 
(b) Is it possible to develop a reliable, valid 
measure of “teaching ability?” The first ques- 
tion has received a great deal of attention and 
has usually been studied through factor 
analyses (cf. Hildebrand, Wilson, & Dienst, 
1971) or by discriminant function analysis of 
students’ ratings of professors (Field, Simp- 
kins, Browne, & Rich, 1971). The results of 
these analyses, however, are open to question. 
Current student evaluation forms are often 
ambiguous, verbose, disorganized, and arbi- 
trarily developed. They consist of global 
behavioral measures and vague trait descrip- 
tions. As a result, the forms tend to be 
unreliable and very susceptible to response 
biases. Different ways of investigating the 
original two questions are necessary. 

A rigorous and comprehensive project con- 
ducted by Hildebrand et al. (1971) demon- 
strated some innovative procedures. First, two 
evaluation forms were empirically (rather 
than arbitrarily) developed whose items met 
objective criteria (e.g., each item on one 
form discriminated between best and worst 
teachers at the .001 level). Second, items were 
developed and judged by students rather than 
by faculty members. The evaluation forms, 
however, still contain vague, global behaviors
and trait descriptions. Also, the use of a
simple 5-point scale for each item is not
conducive to eliminating response
biases such as leniency and central tendency.
As in past studies, independent dimensions
were obtained from factor analysis of the
evaluation form. However, the weaknesses of
the evaluation form suggest that results of the
analyses be viewed cautiously. Finally, the
forms by Hildebrand et al. attempt, as do
most forms, to encompass the teaching of all
university disciplines. Yet, it is unlikely that
all, or even most, disciplines require identical
patterns of teacher behavior. By its very
nature, an evaluation form for all disciplines
is not conducive to specific behavior and trait
descriptions. Specific items, and possibly even
dimensions of teaching ability, that are appro-
priate for the teaching of psychology may be
quite inappropriate for the teaching of art,
philosophy, physics, etc.


1 Requests for reprints should be sent to Sheldon
Zedeck, Department of Psychology, University of
California, Berkeley, California 94720.

One approach to the development of an 
evaluation or appraisal form that reduces the 
weaknesses inherent in most forms is the pro- 
cedure suggested by Smith and Kendall 
(1963) for the construction of behaviorally 
anchored rating scales. The procedure in- 
volves the development of dimensions and 
items of performance criteria by independent 
groups and it has several advantages for the 
development of teaching evaluation forms: 
(a) members of the rater population, stu-
dents, construct the scales; (b) conceptually
independent dimensions are obtained which
elicit consensus among raters as to the con-
struct validity and exhaustiveness of the
broad areas of performance that should be
evaluated; (c) specific behavioral incidents,
which retain student terminology, are used as
anchor points on each dimension scale and
thereby eliminate gross performance descrip-
tions and many response biases; and (d) in
actual use, students document ratings with
specific incidents that they have observed, a
procedure which favors honest and conscien-
tious ratings. The purpose of the present
study is to develop behaviorally anchored
rating scales for the evaluation of the teach-
ing ability of psychology professors.


METHOD

Subjects


Subjects were 231 male and female undergraduate
students at the University of California, Berkeley,
each of whom was enrolled in at least one psy-
chology course, lower or upper division, when par-
ticipating in the study. The sample included both
psychology majors and nonmajors.


Procedure 


Stage 1: Generation of dimensions [...]. [...] (n = 38) consisted [...]
task, two volu[...] convened with the first [...] the work of the [...]
good, average, or poor [...] each dimension.

The purpose of Conference 2 (n = 10) was to review the dimensions obtained in
Conference 1 and to supplement the work, if necessary, with additional
dimensions or behavioral examples.

Stage 2: Reallocation of behaviors. An independent group of students (n = 54)
was provided with [a list] of the behavioral examples (randomly [ordered]) and
a list of the dimensions and definitions [from] Stage 1. Students assigned each
item to the dimension that the example was thought [to illustrate]. Examples
that were not assigned to [the same dimension by] 60% of the students were
eliminated from [further] analyses. Effectively, ambiguous examples were
eliminated because of lack of [agreement] as to which dimensions the examples
illustrated.

Stage 3: Assignment of values. Students (n = 139) were provided with dimension
lists with [the correspond]ing items that met the criterion established in Stage
2. Scale values ranging from 1 to 7 (reflecting very
poor to very good performance) were assigned by the
students to each example in the respective dimension.
Items with standard deviations greater than 1.50 
(the same criterion used by Fogli, Hulin, & Blood, 
1971, and Smith & Kendall, 1963) were eliminated 
from further consideration. Effectively, the retained 
examples elicited agreement as to the type of per- 
formance illustrated and the degree to which the 


behavior represented poor to good performance on a 
7-point scale. 
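The two retention rules described in the procedure (at least 60% agreement on the dimension in Stage 2, and a standard deviation no greater than 1.50 in Stage 3) can be sketched as simple filters. The item data below are invented for illustration, the function names are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not say which formula was applied.

```python
from statistics import stdev  # sample SD (n - 1); an assumption, not stated in the text

AGREEMENT_CUTOFF = 0.60  # Stage 2: share of students choosing the same dimension
SD_CUTOFF = 1.50         # Stage 3: items with a larger SD are dropped


def modal_agreement(assignments):
    """Return the most frequently chosen dimension and its share of all choices."""
    counts = {}
    for dimension in assignments:
        counts[dimension] = counts.get(dimension, 0) + 1
    top = max(counts, key=counts.get)
    return top, counts[top] / len(assignments)


def survives_stage2(assignments):
    """Keep an item only if enough raters placed it in the same dimension."""
    _, share = modal_agreement(assignments)
    return share >= AGREEMENT_CUTOFF


def survives_stage3(ratings):
    """Keep an item only if its 1-7 scale values show enough consensus."""
    return stdev(ratings) <= SD_CUTOFF


# Hypothetical items: one clearly allocated, one split evenly between dimensions.
clear_item = ["delivery"] * 7 + ["organization"] * 3
ambiguous_item = ["delivery"] * 5 + ["organization"] * 5

# Hypothetical scale values: one item with consensus, one without.
tight_ratings = [6, 7, 6, 5, 6, 7, 6]
spread_ratings = [1, 7, 2, 6, 1, 7, 4]

print(survives_stage2(clear_item), survives_stage2(ambiguous_item))
print(survives_stage3(tight_ratings), survives_stage3(spread_ratings))
```

Applying the filters in sequence mirrors the narrowing of the item pool from one stage to the next.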


The behavioral examples that were used as anchors
in the final evaluation form were reworded from
actual behaviors to expected behaviors. That is, a
statement such as "This professor tells the class to
read chapters 3, 4, and 5 of the text and then lec-
[...]

[...]ing the scales, central tendency or judging effects
should be minimized [...] so verifiable that judgment, and
[...]val if subsequent ratee [...] that the rater should
compare the types of behaviors illustrated in the
anchors with similar behaviors that the ratee has
either demonstrated in the past or could be
expected to demonstrate in the future.


RESULTS AND DISCUSSION

The synthesis of the four groups' work in
Conference 1, Stage 1, resulted in seven di-
mensions: depth of knowledge, delivery,
organization, interpersonal relationships with
students, rele[vance, ...] inspiration [...]
behavioral [...] behavior and specific [...]
[w]ere usually combined in [...] [O]ne of the ta[sks ...]
was to dev[elop ...] new dimen[sions ...]
TABLE 1


DESCRIPTIVE STATISTICS OF DIMENSIONS


                           No. items  Range of SD  No. items     No. anchors  Range of M  Range of SD
                           after      for          which met     on final     on final    on final
Dimension                  Stage 2    Stage 3      1.50 crit.    scale        scale       scale

Depth of knowledge         20         .91-1.50     19            9            1.44-6.35   .91-1.36
Delivery                   32         .85-1.51     31            9            1.45-6.40   .85-1.00
Organization               20         .[?]4-1.60   18            7            1.90-6.21   1.00-1.13
Interpersonal relations
  with students            42         [?]          4[?]          10           1.32-6.51   [?]
Relevance                  20         [?]          15            8            1.38-6.06   [?]
Testing                    31         [?]          21            9            1.71-5.83   [?]
Grading                    25         [?]          13            8            1.63-5.97   [?]
Assignment and work load   18         [?]          16            8            1.65-6.09   .85-1.[?]
Ability to motivate        23         .91-1.39     23            10           1.60-6.35   .96-1.39


Conference 1’s work indicated that still 
another dimension, assignments and work 
load, should also be added. Each student in 
Conference 2 provided three critical incidents 
for this dimension as well. 

A review of the pool of all examples gen- 
erated in both conferences resulted in the 
modification or elimination of vague, global, 
or incomprehensible items and the addition 
of extra items, especially items illustrating 
ordinary or mediocre teaching performance. 
The final pool consisted of 310 critical inci- 
dents. After Stage 2 (reallocation of be- 
haviors), the number of behavioral incidents 
was reduced to 231. After Stage 3 (assign- 
ment of values), the number of behavioral 
items was reduced to 199. The number of 
behavioral incidents which appeared as 
anchors on the final form was 78. Descriptive
data appear in Table 1. 

Table 1 indicates that nearly all the items 
in each dimension met the standard deviation 
criterion of 1.50. Many items with standard
deviations below 1.50 were not included in the 
final scales because their mean scores were 
similar to other items (with standard devi- 
ations below 1.50) on the same dimension. 
The items that were selected for the final 
scales represented a wide range of behaviors 
in terms of mean values and met subjective 
criteria of brevity, clarity, and contrast in
subject matter with other items. 

The interpersonal relationships and rele- 
vance dimensions each required the accepting 
of one item with standard deviations of 1.54 


and 1.52, respectively, so as to provide 
anchors for the scale areas representing 
mediocre performance. As Landy and Guion 
(1970) observed, the problem of obtaining 
critical incidents for ordinary or mediocre 
performance often arises when developing 
behaviorally anchored scales. In the present 
study, students in Stage 1 had difficulty gen- 
erating acceptable critical incidents for medi- 
ocre performance. Furthermore, results of 
Stage 3 showed that a given dimension would 
invariably contain few items which had mean 
values between 3.5 and 4.5, and even those 
items often had larger standard deviations 
than items that received mean values below 
3.5 or above 4.5. As an extreme example, 
there are no anchors between the means 2.88 
and 5.65 in the organization dimension be- 
cause the standard deviations of the few items 
which fell within that range of means were 
simply too large to be acceptable. This diffi- 
culty in obtaining behaviors of medium value 
may be a function of the instructions and the 
connotations of the words “average” or 
“mediocre.” Though students were asked for 
examples of good, poor, and mediocre per- 
formance, the request was for critical inci- 
dents. Emphasis on critical incidents may 
preclude the opportunity for noncritical, 
mediocre examples. Perhaps other adjectives 
than average or mediocre should be used, for 
example, satisfactory or acceptable; or an- 
other strategy may be to ask for examples 
that range from good to poor without specify- 
ing a midpoint. 
TABLE 2


INTERPERSONAL RELATIONS WITH STUDENTS: THE PROFESSOR'S RAPPORT
WITH AND SENSITIVITY TO STUDENTS


When a class doesn't understand a certain concept or feels "lost," this professor could be expected to sense
it and act to correct the situation.

This professor could be expected to answer the student's questions about learning and conditioning without
making the student feel stupid and without making the student feel that he's bothering the professor.

When confronted with questions after class, this statistics professor could be expected to stay and talk to
the students until the next class must begin.

This professor, when a student comes to his office for help, could be expected to go through one explanation
of the material and tell the student to read certain chapters of the text and to come back if he still has
troubles understanding the material.

During lectures, this professor could often be expected to tell students with questions to see him during his
office hours.

If a student asks this statistics professor to help him with [...] tables a few days before the final exam, this
professor could be expected to say that he has no time because he is very busy composing the exam, and to
tell the student to ask a TA.

This professor could be expected to not see students individually, [...] his regularly scheduled
office hours.

This professor is never in his "official office." He could be expected to maintain his office in another part of
the campus where he does his research and in order to learn of its whereabouts, students must ask him in-
dividually.

In this experimental psychology class, if a student approaches this professor after a lecture on visual-search
and tells the professor that he is interested in devising an apparatus that will measure visual-search time
more efficiently than present methods, the professor's attitude could be expected to be an "I-really-don't-
care-if-you-do-it-or-not" attitude.

This professor could be expected to try to humiliate or embarrass students who disagree with him.


Table 2 shows one scale which appeared in
the final evaluation forms, interpersonal rela-
[tions with students ...].2 [...] the testing and grading
dimensions may be appropriate. The resu[lts ...]
standard deviations indicate that students general[ly]
are divided with respect to the types of per-
[formance] certain grading and/or testing pro-
cedures reflect. On the other hand, there was
[...] substanti[al ...] examined in future research.


2 Complete set of scales [...]
second author.


[...]ection of the scales [...] two components to teach-
ing [...]. The scales of depth of knowledge,
[deliver]y, organization, interpersonal relations,


relevance, and inspiration and motivation 
could be considered intrinsic factors. That is, 
these components are the input of the instruc- 
tor; the what and how he wants to make the 
course. In contrast, the scales of testing, grad- 
ing, and assignments and work load could be 
considered extrinsic components. These scales 
are the outputs to the student; the relatively 
tangible product the student receives. 

Of relevance to this categorization are data
provided by faculty (n = 10) in the Psy-
chology Department. Faculty were requested
to provide values to the items (same task 
required of students in Stage 3) developed by 
the students. Rank-order correlations between 
student and faculty values for only those 
items retained on each scale indicated 1.00 or 
near 1.00 correlations on all scales, except 
testing, grading, and assignments and work 
load. In other words, faculty and students 
agreed on the rank-order value of the be- 
haviors for the intrinsic components but dis- 
agreed on the value of the items for the 
extrinsic scales. 
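The student-faculty comparison rests on rank-order correlation of the item values within each scale. A minimal sketch follows, computing Spearman's coefficient as the Pearson correlation of average ranks; the item values are invented for illustration and are not the study's data.

```python
def average_ranks(values):
    """Rank values from 1 (smallest), giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical mean scale values for six anchors on one dimension.
student_values = [6.5, 5.8, 4.9, 3.1, 2.2, 1.4]
faculty_same_order = [6.1, 5.9, 4.2, 3.5, 2.0, 1.8]  # same ordering as students
faculty_mixed_order = [2.0, 6.0, 3.0, 5.5, 4.0, 1.0]  # ordering disagrees

rho_agree = spearman_rho(student_values, faculty_same_order)
rho_disagree = spearman_rho(student_values, faculty_mixed_order)
print(rho_agree, rho_disagree)
```

A coefficient at or near 1.00 means the two groups order the anchors identically even when the numerical values they assign differ.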

In general the form is concise, clearly 
organized, and since the student provides 
only nine ratings, easy to fill out, check, and 
review. The form can be supplemented with 
space for an overall evaluation of the profes- 
sor or with questions concerned with the 
characteristics of a particular course. Con- 
versely, the form may be shortened: a dimen- 
sion(s) may be deleted from the form when 
it is inappropriate; for example, students 
rating a professor who does not give exam- 
inations would use a form with only eight 
dimensions. 

In addition, all of the examples obtained 
through this procedure can be used as stan- 
dards of good, mediocre, and poor teaching 
in a training course. The examples with low 


standard deviations reflect agreed upon be- 
havior and can form the basis for a much 
needed course in training of future college 
instructors. 

In conclusion, the literature suggests: (a) 
Students are competent and mature raters 
who separate good teaching from showman- 
ship and popularity (cf. Kent, 1966). The 
present study seems to support this general-
ization. (b) Efficient and scientific assessment
of a professor's teaching is impossible without 
systematic student opinion as a primary 
source of the assessment (Slobin & Nichols, 
1969). We agree and suggest that student 
opinion be a key input for evaluation of fac- 
ulty teaching and for development of scales 
for this purpose. 
SHORT NOTES


EFFECTS OF SIZE OF SAMPLE ON EIGENVALUES, OBSERVED
COMMUNALITIES, AND FACTOR LOADINGS 1


LAWRENCE M. ALEAMONI 2


University of Illinois


Factor analysis is a technique often used in
applied psychology to identify the underlying di-
mensions in a domain of variables. Thus, when
dealing with a new aptitude domain, the corre-
lations obtained among a large number of tests
may be explained in terms of a relatively small 
number of extracted factors which presumably 
the finite population. The 15 variables consisted
of five orientation test scores (English grammar,
reading comprehension, vocabulary, general in-
formation, and numerical), four American College
Testing Program standardized test scores (Eng-
lish, Mathematics, Social Studies, and Natural
Science), four high school grades (English, mathe-
matics, social studies, and natural science), par-
ents' educational level, and MSU fall-term grade
point average.

A computer program was written and used to
randomly select (without replacement) 36 sam-
ples from the finite population over five different
levels of N. Three sets consisting of 10 samples
each were drawn with Ns equaling 17, 25, and
100; one set consisting of five samples was drawn
with each N = 400; and one sample with N =
1,600 completed the sampling. These sample lev-
els were chosen in order to represent as wide a
sample range as possible in order to detect where
differences in factor structures might occur. The
correlation matrices, principal-component fac-
tors, and quartimax and varimax rotated factors
were obtained from all the samples and the popu-
lation.

[Sever]al items should be noted at this
[point. First, ...]ies were used as the communal-
[ity ...] each test. Second, six factors
were established as the maximum by using the
Kiel and Wrigley (1960) criterion on the finite
population data. Third, the observed communality
was calculated as the proportion of variance of
[...] accounted for by the factors, by
[...] squares of the factor loadings for
[...].

All sample factor solutions were first compared
to that of the population by the root mean square
deviation and the coefficient of congruence as
found in Harman (1967, pp. 269-272).
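The two similarity indices named above can be sketched for a single pair of factors. The formulas used here are the commonly cited definitions (the coefficient of congruence as a normalized sum of cross-products of loadings, and the root mean square as the square root of the mean squared difference in loadings); it is an assumption that they match the versions in Harman (1967) exactly, and the loadings are invented for illustration.

```python
import math


def coefficient_of_congruence(a, b):
    """Sum of cross-products of loadings divided by the root of the product of sums of squares."""
    numerator = sum(x * y for x, y in zip(a, b))
    denominator = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return numerator / denominator


def root_mean_square_deviation(a, b):
    """Square root of the mean squared difference between paired loadings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical loadings of five variables on one population factor
# and on the corresponding factor from a sample.
population_factor = [0.80, 0.75, 0.70, 0.20, 0.10]
sample_factor = [0.74, 0.81, 0.62, 0.31, 0.02]

cc = coefficient_of_congruence(population_factor, sample_factor)
rms = root_mean_square_deviation(population_factor, sample_factor)
print(round(cc, 3), round(rms, 3))
```

The two indices move in opposite directions: closer agreement raises the coefficient of congruence toward 1 and lowers the root mean square toward 0.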
TABLE 1


MEANS AND STANDARD DEVIATIONS OF EIGENVALUES FOR THE FIVE SETS OF SAMPLES


                                          Sample size

               N = 2,322  N = 1,600   N = 400        N = 100        N = 25         N = 17
Factor         M          M           M       σ      M       σ      M       σ      M       σ

1              6.44       6.54        6.50    .20    6.49    .73    6.91    .57    6.99    1.04
2              1.54       1.55        1.58    .04    1.71    .22    2.00    .28    2.23    .20
3              1.32       1.33        1.31    .17    1.35    .16    1.51    .20    1.68    .31
4               .95        .93         .98    .04    1.04    .07    1.09    .10    1.13    .16
5               .87        .86         .85    .08     .85    .10     .84    .09     .84    .14
6               .60        .58         .63    .03     .66    .08     .69    .07     .63    .09
Total          11.72      11.79       11.85   .05    12.10   .36    13.05   .23    13.50   .32

% variance
accounted for  78         78          79             81             87             90


Additionally, a rank-order correlation was used
to determine the similarity of factors obtained
in various samples with corresponding factors in
the population. This was accomplished by select-
ing a particular factor loading cutoff level that
would define the salient variables in each of the
population factors and then applying that same
cutoff level to each of the corresponding sample
factors (as determined by the root mean square
and the coefficient of congruence solutions) and
observing how the size and position of the salient
variables compared.


TABLE 2


MEANS AND STANDARD DEVIATIONS OF OBSERVED COMMUNALITIES FOR THE FIVE SETS OF SAMPLES


                                          Sample size

            N = 2,322  N = 1,600   N = 400        N = 100        N = 25         N = 17
Variable    M a        M a         M b     σ      M c     σ      M c     σ      M c     σ

1           [?]        .76         [?]     .03    .81     .01    .84     .05    .89     .05
2           [?]        .73         [?]     .02    .75     .06    .85     .06    .90     .05
3           .79        .79         .80     .01    .80     .05    .91     .03    .91     .03
4           [?]        .77         .78     .02    .80     .04    .88     .06    .91     .06
5           .86        .84         .85     .00    .86     .03    .87     .05    .90     .05
6           .77        .77         .78     .01    .81     .03    .86     .05    .88     .06
7           .86        .83         .85     .02    .86     .03    .89     .02    .91     .06
8           .73        .74         .73     .02    .74     .05    .84     .05    .90     .05
9           .76        .75         .75     .02    .78     .04    .90     .02    .91     .06
10          .66        .67         .72     .05    .79     .13    .86     .05    .91     .05
11          .72        .72         .73     .01    .78     .05    .86     .03    .91     .03
12          .87        .76         .84     .06    .82     .06    .89     .06    .90     .06
13          .89        .86         [?]     .12    .81     .11    .86     .06    .86     .06
14          1.00       .99         .99     .01    .96     .02    .90     .05    .93     .03
15          .56        .77         .75     .14    .74     .08    .84     .08    .88     .08
Total       11.73      11.75       11.83          12.11          13.07          13.50


a Solution based on a single group.
b Mean of solutions for 5 samples.
c Mean of solutions for 10 samples.
TABLE 3


AVERAGE ROOT MEAN SQUARES (RMSs) AND
COEFFICIENTS OF CONGRUENCE (CCs)
FOR THE FIVE SETS OF SAMPLES


            Principal
Sample      component       Quartimax       Varimax
size        RMS     CC      RMS     CC      RMS     CC

1,600       .046    .965    .049    .976    .050    .984
400         .109    .866    .096    .937    .099    .952
100         .160    .777    .156    .873    .137    .918
25          .214    .600    .244    .784    .212    .835
17          .241    .633    .268    .746    .231    .804


RESULTS


[T]he six factors [...] data accounted for
78% of the variance while the six factors derived
[...] 25 [...] N = 17) indicate that the
fit to the [...] eigenvalues and observed
communalities [...] as the sample size de-
creases.

The average root mean squares and coefficients
of congruence are presented in Table 3 for the
unrotated principal-component solution and the
quartimax and varimax solutions. The coefficients
of congruence show a steady decline as the sample
size decreases, indicating that factor similarity
declines with smaller sample size. Accordingly, as
the sample size decreases, the root mean squares
increase, indicating that the sample factor struc-
ture shows the largest deviation from the popu-
lation factor structure for small samples.

The comparison of varimax and quartimax co-
efficients of congruence in Table 3 indicates that
(except for the N = 1,600 and N = 400 samples)
the varimax rotational solution is the one provid-
ing the better approximation to the population
factor structure. This confirms earlier indications
(Harman, 1967, pp. 312-313) that varimax so-
lutions are likely to yield a more factorially in-
variant set of factors than the quartimax solu-
tions.

Using the root mean squares and the coeffi-
cients of congruence to identify corresponding
factors between the population factor matrix and
the sample factor matrices, Spearman rank cor-
relations (α = .05) were used to compare the
salient variable positions on each of the corre-
sponding factors. Table 4 presents the average
number and range of sample factors which were
judged to be dissimilar (Spearman rank correla-
tions, with p > .05) from the population factors.
Inspection of the factors which led to the data
in Table 4 indicated that the quartimax and
varimax rotations yielded identical factors for
the 1,600 and 400 samples with an average of
only one or two factors showing dissimilarity for
the N of 100 and increasing to an average of
three dissimilar factors for the N of 17. The un-
rotated principal-component solution, on the other
hand, began to show increasing dissimilarity of
factors from the N of 400 (over two dissimilar)
to the N of 17 (around four dissimilar).3


TABLE 4


AVERAGE NUMBER AND RANGE OF SAMPLE FACTORS WHICH WERE DISSIMILAR
FROM THE POPULATION FACTORS


                                     Sample size

             N = 1,600  N = 400          N = 100          N = 25           N = 17
Solution                Average  Range   Average  Range   Average  Range   Average  Range

Quartimax    [?]        [?]      [?]     [?]      [?]     2.9      2-4     2.7      1-4
Varimax      [?]        [?]      [?]     [?]      [?]     3.1      2-4     2.6      1-4
Unrotated    [?]        [?]      [?]     [?]      [?]     3.9      2-6     3.5      2-6


DISCUSSION 


If one assumes that the population factor 
structure (here defined as consisting of six fac- 
tors) can account for the optimum level of fac- 
tor variance for a given set of variables, then the 
increase in factor variance accounted for by the 
six factors of the smaller samples indicated that 
we may be accounting for more error variance 
with decreasing sample size. The decreasing fac- 
tor similarity as the sample size was decreased 
(to Ns of 100 and below), accompanied by the 
actual increase in total factor variance accounted 
for, indicates that the increased variance ac- 
counted for should be considered as error. How- 
ever, the study indicated that if we want to use 
sample factor structures as a basis for generaliz- 
ing to their corresponding population factor 
structures, drawing random samples of N = 400
is adequate for generalizing to a population of N
= 2,322.

Future studies utilizing sample sizes between
N = 100 and N = 400 are needed in order to
specify the exact minimum sample size required
to reflect the population factor structure. It may
well be that Ns smaller than 400 (but certainly
larger than 100) will be adequate.


3 A comparison of the correlation matrices ob-
tained from each sample to the matrix obtained
from the population revealed the expected increasing
dissimilarity with decreasing sample size. The average
difference between sample and population correla-
tions for N = 1,600 was approximately .01, whereas
for N = 17 it was approximately .18.
M_s and performance can be expected for these
ability groups. 


METHOD 
Subjects 


One hundred and thirty-eight boys and 157 girls 
in the seventh grade in a middle-sized Norwegian 
town were tested for n Ach and ability in the fall of 
1966. 


Assessment of the Motive to Achieve Success 


The measure of M_s was the thematic apperceptive
score of n Ach (McClelland, Atkinson, Clark, & Low-
ell, 1953). The administration followed the standard
procedures described by Atkinson (1958, Appendix
III). Six pictures were shown to elicit stories.2


Assessment of Ability 


Ability (which is assumed to indicate P_s) was
measured by the score on a group test of mental
maturity (IQ) (Sandven, Rand, & Nordgaard, 1952).
The product-moment correlation between n Ach and
IQ was .06 for boys and .21 for girls.

The pupils were divided into high-IQ groups (boys,
n = 35; girls, n = 26), moderate-IQ groups (boys,
n = 74; girls, n = 104), and low-IQ groups (boys,
n = 29; girls, n = 27).


Assessment of School Performance 


Marks obtained on an examination at the end of 
a semester in the written subjects arithmetic, Nor- 
wegian composition, and English served as criteria of 
school performance. These marks were also summed 
as a total performance score (i.e., sum score).


RESULTS AND IMPLICATION 


Figure 2 illustrates the partial coefficients of 
correlation between n Ach and sum scores for 
different ability groups. It is seen that the pat- 
tern of results was in accordance with the hy- 
potheses. Also the correlations between n Ach and 
the separate grades followed this pattern very 
well for girls. The partial coefficients were as 
follows: high-IQ girls, .26, .30, .36 (p < .10, p <
.10, p < .05, respectively); moderate-IQ girls, .02,
.03, .00 (p > .10); low-IQ girls, -.12, -.01, .09
(p > .10). For boys the partial coefficients with
separate grades were as follows: high-IQ boys,
.05, -.11, -.04 (p > .10); moderate-IQ boys,
.27, .47, .18 (p < .01, p < .10, p < .10, respec-
tively); low-IQ boys, .13, .09, .28 (p > .10, p >
.10, p < .10, respectively).


2 The pictures were chosen and the stories were
scored by university lecturer Lise Vislie, whose re-
sults on the practice materials presented by Atkin-
son (1958) correlated about .90 with those of an
expert. The author would like to express gratitude
toward her.
FIG. 1. Hypothetical curves showing sex differences
in aroused motive to achieve success (M_s) as a
function of ability level. (Abbreviations: T_s = ap-
proach motivation; P_s = probability of success.)


The class in which ability and sex are hetero- 
geneous seems to favor achievement-oriented ac- 
tivity for average-ability boys and high-ability 
girls. Converting the other results into educa- 
tional policy, the following implications might 
be sketched: (a) Bright boys need to be more 
stimulated, and the curriculum must be more 
challenging since these boys are not inspired to 
utilize their motive to achieve. (b) Low-ability
boys and low- and average-ability girls are not
stimulated to utilize their motive to achieve, prob-
ably because they find the demands too heavy.
Hence, the curriculum must give these pupils
opportunities to succeed. This should increase
their expectations for success, and a resulting in-
crease in positive interest leading to enhancement
of performance should occur.


FIG. 2. Partial coefficients of correlation between
need for achievement and school performance for
high-, moderate-, and low-IQ groups.
formance and also to provide an estimate of the
magnitude of such effects. 


METHOD 
Subjects 


The subjects for this investigation were 440 
college students, 220 males and 220 females, en- 
rolled in first-year psychology courses at Texas 
Tech University. They received extra course 
credit for participation in this experiment. 


Apparatus and Procedure 


The subjects performed the monitoring task 
individually, although a maximum of four sub- 
jects could be tested simultaneously. Upon enter- 
ing the testing room, the experimenter, a 25-year- 
old male, instructed the subjects to remove their 
watches and not to place their hands on the dis- 
play in any way. The subjects were then seated in 
three-sided cubicles and told that instructions 
would be presented over a loudspeaker. After an 
initial preadaptation period of 2 minutes, during 
which time the subjects merely relaxed, instruc- 
tions were presented regarding performance of 
the task. The subjects were then given a 5-minute 
practice period during which signals occurred at 
the same rate as that presented during the regu- 
lar monitoring session. Following the practice 
session, the experimental monitoring session be- 
gan. 

The task of each subject was to monitor a 
visual display for 1 hour in order to detect 
aperiodic signals occurring against a background 
of discrete, regularly occurring events. An event 
was defined as the apparent movement of a dot of 
light .32 centimeters in diameter. The dot moved 
downward 1.58 centimeters, returned to its origi- 
nal position, and then repeated the sequence. The 
dot remained in the downward position for .3 
seconds, returned to its original position for .3 
seconds, deflected again for .3 seconds, and then 
returned to its original position for the remainder 
of the event interval. Event rate was set at 30 per 
minute. A signal was defined as an increase in 
the magnitude of the second deflection from 1.58 
centimeters to 2.22 centimeters. The display used 
to produce the apparent movement was an IEE 
one-plane readout. Signal rate was set at 24 per 
hour. The sequence of signals was presented by 
paper tape which controlled the operation of 
automatic programming equipment. 

Signals, responses, and the event number were 
recorded on a BRS-Foringer digital printout 
counter. The response key was a miniature push
button placed in the end of a bicycle handgrip.
Throughout the monitoring session, low-level
white noise was played through the loudspeaker
from recorded tape. Two response measures were
obtained: percent correct detections and the
number of false alarms. A correct detection was
defined as a response occurring within 1.6 seconds
after the presentation of a signal.

Fig. 1. Mean percent correct detections as a function
of sex and time on task.
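The scoring rule just described (a response within 1.6 seconds after a signal counts as a correct detection) can be sketched in Python. This is a minimal illustration, not the authors' scoring procedure: the function name, the time stamps, and the choice to count every unmatched response as a false alarm are assumptions.

```python
def score_session(signal_times, response_times, window=1.6):
    """Return (percent correct detections, number of false alarms).

    Times are in seconds from the start of the session. A response is
    matched to a signal when it falls within `window` seconds after it.
    Treating every unmatched response as a false alarm is an assumption.
    """
    unmatched = sorted(response_times)
    hits = 0
    for signal in sorted(signal_times):
        for index, response in enumerate(unmatched):
            if 0 <= response - signal <= window:
                hits += 1
                del unmatched[index]  # each response can match one signal only
                break
    percent_correct = 100.0 * hits / len(signal_times) if signal_times else 0.0
    return percent_correct, len(unmatched)


# Hypothetical data: three signals, four key presses.
percent_correct, false_alarms = score_session(
    [10.0, 100.0, 200.0], [11.0, 50.0, 201.5, 300.0]
)
```

In this made-up session two of the three signals are detected and two key presses fall outside any detection window.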


RESULTS 


The data for percent correct detections are 
presented in Figure 1. Following an arc sine trans- 
formation, the data were analyzed by a split-plot 
factorial analysis of variance having one between 
measure (sex) and one within measure (time on 
task). Significant main effects were obtained for 
sex (F = 18.76, df = 1/438, p < .001) and time 
on task (F = 97.88, df = 2/876, p < .001), but 
not for their interaction (F = .21, df = 2/876).
Tukey’s honestly significant difference (HSD)
test for pair-wise comparisons indicated all differences
among time blocks to be significant
(p < .01).

Fig. 2. Mean number of false alarms as a function
of sex and time on task.

The data for the number of false alarms are
presented in Figure 2. Significant effects were
obtained for time on task (F = 35.56, df = 2/876,
p < .001), and the Sex × Time on Task
interaction (F = 3.12, df = 2/876, p < .05), but
not for sex (F = .70, df = 1/438). A test of simple
main effects indicated significant differences
between males and females (p < .05) to occur 
only during the first 20-minute time block. For 
the time on task effect, a Tukey’s HSD test indi- 
cated only the differences between the first and 
second, and the first and third time blocks to be 
significant (p < .01). 

A set of F tests was performed in order to
assess differences in variability between males and
females. No differences were obtained for detection
performance. However, for the false alarm
measure, females were found to yield significantly
(p < .01) greater variability than males.

In order to estimate the magnitude of the
obtained effects, point-biserial correlations were
computed between sex and the performance measures.
Coding males 0 and females 1, the obtained
point biserial with detection performance was
−.20. The obtained correlation with false alarms
during the first 20-minute time block was .07.
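As an illustration of the effect-size measure used here, the point-biserial correlation between a 0/1 group code and a continuous score can be computed as below. This is a sketch under stated assumptions: the function name and the four data points are hypothetical, and the standard textbook formula is used rather than any procedure documented by the authors.

```python
from math import sqrt


def point_biserial(groups, scores):
    """Point-biserial r between a 0/1 group code and a continuous score.

    Uses r = (M1 - M0) / SD * sqrt(p * q), where SD is the population
    standard deviation of all scores and p, q are the group proportions.
    """
    n = len(scores)
    group1 = [y for g, y in zip(groups, scores) if g == 1]
    group0 = [y for g, y in zip(groups, scores) if g == 0]
    mean_all = sum(scores) / n
    sd_all = sqrt(sum((y - mean_all) ** 2 for y in scores) / n)
    p = len(group1) / n
    q = 1.0 - p
    mean1 = sum(group1) / len(group1)
    mean0 = sum(group0) / len(group0)
    return (mean1 - mean0) / sd_all * sqrt(p * q)


# Hypothetical scores; 0 = male, 1 = female, as in the coding above.
r_pb = point_biserial([0, 0, 1, 1], [1.0, 2.0, 3.0, 4.0])
```

With this coding, a negative coefficient means that the group coded 1 has the lower mean score.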
power for detection performance is only .19 (Cohen,
1969). It is interesting to note that for the
literature reviewed, 20% of the studies obtained 
significant effects due to sex. Such a value is rela- 
tively near the estimated average power of these 
studies. It is suggested that the failure of previous
researchers to obtain significant differences due to
sex may be attributed to the insufficient power of
their designs for the detection of such low-magnitude
effects.

While the effects due to sex may be small, it 
often happens that the magnitude of the effect in 
which the experimenter is interested may like- 
wise be small. Any reduction of the unexplained 
variance, through matched groups or a suitable 
statistical control procedure, will certainly en- 
hance the power of his design. To the extent that 
the present results are replicable, a 10% difference
in detection performance, while not large,
may be of some value in certain types of practical
monitoring situations.
THE EFFECT OF HEREDITY ON ATTITUDES TOWARD
ALCOHOL, CIGARETTES, AND COFFEE


ARNON PERRY 1


Tel Aviv University, Tel Aviv, Israel 


Using the twin method, the present study examines the relative effect of hered- 
ity and environment on attitudes toward alcohol drinking, cigarette smoking, 
and coffee drinking. Results support a significant genetic factor in the attitude 
toward alcohol drinking, but not in attitudes toward cigarette smoking and
coffee drinking.


The question of the relative effect of heredity 
and environment on human characteristics has 
been of interest to biologists and psychologists 
for a long time. In the past, the focus of research 
on the nature-nurture question has been on 
physical characteristics and intelligence, but re- 
cently personality traits and other aspects of hu- 
man behavior have come into focus. The present 
study examines the relative effect of heredity 
and environment on attitude. 

In a study of the inheritance of alcoholic drink- 
ing, coffee drinking, and cigarette smoking be- 
havior, Partanen, Bruun, and Markkanen (1966) 
found that of the alcoholic drinking variables, 
density—a combined factor of frequency and 
regularity—and amount showed a significant 
heritability, while both coffee drinking and cig- 
arette smoking showed a significant heritability 
only in the amount consumed. Perry (1971) 
found significant results for cigarettes and coffee 
consumption, but not for alcohol. Since some 
evidence exists that the consumption of alcohol, 
cigarettes, and coffee has a significant genetic 
factor, it was decided to examine the attitude 
toward these products. 


METHOD 


The research design calls for a sample of 
monozygous-identical (MZ) and dizygous-frater- 
nal (DZ) twins. The underlying principle of the 
twin method is that MZ twins have identical 
genotypes; therefore, any observed dissimilarity 
within pairs must be due to environmental factors. 
Dizygous-fraternal same-sex twins, while on the 
average differing in 50% of their genes, provide a 
measure of environmental control not otherwise 
possible by virtue of sharing such factors as birth 
rank, mother's age, etc. Once the within-pairs 
variances ($V_{MZ}$ and $V_{DZ}$) have been calculated,
the twin method allows us to isolate the effect of
heredity according to the following model:


$V_{DZ} = V_{Environment} + V_{Heredity}$   [1]
$V_{MZ} = V_{Environment}$   [2]
$V_{DZ} - V_{MZ} = V_{Heredity}$   [3]


1 Requests for reprints should be sent to Arnon
Perry, Leon Recanati Graduate School of Business
Administration, Tel Aviv University, Tel Aviv, Israel.


The most frequently used statistic in twin studies
is Holzinger's (1929) Heritability Coefficient
(H). The ratio is


$H = (r_{wMZ} - r_{wDZ}) / (1 - r_{wDZ})$


or   [4]


$H = (\sigma^2_{wDZ} - \sigma^2_{wMZ}) / \sigma^2_{wDZ}$


where $r_w$ is the within-pairs product-moment
correlation coefficient for MZ and DZ twins, provided
each twin is entered both ways. The same
ratio can be computed by working with the variances
for MZ and DZ twins; that is,

$\sigma^2_w = \sum d^2 / 2n$   [5]

where d is the twin difference and n is the number
of families in the sample. The range for H
is between 0 and 1, where 1 indicates complete
control of the genetic factor on the trait and 0
indicates complete control of the environmental
factor. The test of significance is computed according
to the following F ratio:

$F = \sigma^2_{wDZ} / \sigma^2_{wMZ}$   [6]

with $n_{DZ}$ and $n_{MZ}$ degrees of freedom.
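The variance form of Holzinger's H and the accompanying F ratio can be illustrated with a short Python sketch. It assumes the usual within-pair variance estimate (the sum of squared twin differences divided by twice the number of pairs); the function names and the twin scores are hypothetical and are not data from this study.

```python
def within_pair_variance(pairs):
    """Within-pair variance: sum of squared twin differences over 2n."""
    n = len(pairs)
    return sum((a - b) ** 2 for a, b in pairs) / (2 * n)


def holzinger_h(var_w_dz, var_w_mz):
    """H = (within-DZ variance - within-MZ variance) / within-DZ variance."""
    return (var_w_dz - var_w_mz) / var_w_dz


def f_ratio(var_w_dz, var_w_mz):
    """F = within-DZ variance / within-MZ variance."""
    return var_w_dz / var_w_mz


def h_from_f(f):
    """Because H = 1 - (within-MZ / within-DZ), H equals 1 - 1/F."""
    return 1.0 - 1.0 / f


# Hypothetical attitude scores for three MZ and three DZ pairs.
mz_pairs = [(10, 12), (8, 8), (15, 14)]
dz_pairs = [(10, 14), (9, 6), (12, 13)]
var_mz = within_pair_variance(mz_pairs)
var_dz = within_pair_variance(dz_pairs)
h = holzinger_h(var_dz, var_mz)
f = f_ratio(var_dz, var_mz)
```

Under this formulation H and F carry the same information, since H = 1 - 1/F. For example, an F of 2.03 corresponds to an H of about .51.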


The assumptions embodied in this model are

1. There is no genotype-environment interaction.

2. The genetic and environmental factors are
additive.

3. The degree of environmental similarity, or
press, within MZ and DZ pairs is the same.


In every twin study, one of the major ques- 
tions is how to classify twins into zygosity groups. 
TABLE 1
MEANS, SIGMAS, CORRELATION COEFFICIENTS, AND HERITABILITIES
Variable | X̄_MZ | X̄_DZ | σ_MZ | σ_DZ | r_MZ | r_DZ | H | F
| : | | s 2.03* 
Alcohol 85.5 860 | 142 | us | 66 A | t 29 
Cigarettes 60.9 63.1 156 | 3.0 | 26 | - 4 | 1 
Coffee 101.4 100.4 8.2 | 7.8 | 2 | R : | 
Note. Abbreviations: MZ = monozygous-identical twins, DZ = dizygous-fraternal twins.
* p = .05.
know," they were classified as DZ. The latter
method was used in this study. In addition the
subjects (mostly college students) were asked

2. Drinking alcohol is a risky but rather enjoyable
habit.

3. The risk involved in drinking alcohol is

ts of twins participated in the
study, 46 MZ and 38 DZ twins.

Drinking alcohol is unsafe.

19. The potential danger in drinking alcohol is
not worth the pleasure. 

20. If you know how to handle yourself, there 
is no risk in drinking alcohol. 


The same questions were repeated for cigarette 
smoking and coffee drinking. The sum of the 
scores on the 20 statements was computed for 
each subject; however, for some of the state- 
ments the direction was reversed, on the basis of 
an item analysis that was performed on a sepa- 
rate sample. 
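The scoring procedure (summing 20 statements after reversing the direction of some of them) can be sketched as follows. The response format is not stated in the text, so the 7-point scale, the function name, and the choice of reversed items are all assumptions made for illustration.

```python
def attitude_score(responses, reversed_items, scale_min=1, scale_max=7):
    """Sum item responses, reverse-keying the items listed in `reversed_items`.

    `responses` holds one rating per statement; item numbers start at 1.
    The 1-to-7 response scale is an assumption, not taken from the study.
    """
    total = 0
    for item, value in enumerate(responses, start=1):
        if item in reversed_items:
            # Reverse-keying maps scale_min to scale_max and vice versa.
            value = scale_max + scale_min - value
        total += value
    return total


# Hypothetical subject who marks 7 on every statement; items 2 and 5 reversed.
score = attitude_score([7] * 20, reversed_items={2, 5})
```

Reverse-keying keeps a high total interpretable in a single direction regardless of how each statement is worded.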


RESULTS AND CONCLUSIONS 


The means, sigmas, correlation coefficients, and 
Holzinger's (1929) H and F ratios are presented 
in Table 1. 

This study supports the concept of heredity, as 
applied to attitude, as a viable concept. The genetic
factor accounted for 51% of the variation
in the attitude toward alcohol. It is possible that
a significant genetic factor will be found in the 
attitude toward other products or even abstract 
concepts or ideas. However, it should be pointed
out that the fact that the variance in the attitude
toward a concept or a product can be largely explained
by the genetic factor does not mean that
all environmental efforts to influence it should
cease. It means that the genetic factor should be
taken into account when allocating limited re- 
sources for environmental efforts and that greater 
influence could be exercised in situations where 
the genetic factor is small as compared to those 
where it is large. Further study is needed in this 
area, especially in trying to link heredity, be- 
havior, and attitude. 
QUESTIONNAIRE LENGTH AND RESPONSE RATE 1


DOUGLAS R. BERDIE 2 


Measurement Services Center, University of Minnesota 


This study examined the relationship between questionnaire length and response
rate. A stratified random sample of 108 university


one-page, two-page, 
relationship appeared 


Because of difficulties in gathering data through 
mailed questionnaires (foremost of these is the 
low rate of response), experimenters have tried 
many methods to increase the utility of question- 
naires as an information-gathering device. Studies 
have been conducted in regard to (a) question 
construction, (b) physical appearance of the 
questionnaire, (c) various postal procedures, and 
(d) other aspects of questionnaire methodology 
that aim to increase the response rate and, conse- 
quently, the validity of the study. 
Common sense suggests that the shorter the
questionnaire, the more likely a high response
rate, and persons studying questionnaire efficiency
have tended to accept this belief in spite
of little empirical evidence to support it. A review
of questionnaire techni

between questionnaire length and response rate,


attempted to keep the questionnaires as short as
possible [p. 28].” Let us look, therefore, at the
actual reports of Sletto and Parten. 

In the Sletto (1940) study, respondents were 
sent 10-page, 25-page, and 35-page questionnaires. 
Of those receiving 10-page questionnaires, 68% 
responded; of those receiving 25-page questionnaires,
60% responded; and of those receiving
35-page questionnaires, 63% responded. Obviously,
length and response rate were not related.
A methodological problem with Sletto’s study was
that the 35-page questionnaire was simply the
10-page and 25-page questionnaires combined. If
his subjects found the 10-page questionnaire more
interesting than the 25-page questionnaire, this
increased interest could have boosted the response
rate of the 35-page questionnaire over that of the
25-page questionnaire.

When we look at the original Parten (1950) 
report, we again find no evidence of a relationship
between length and response rate. At first
Parten simply reiterates the Sletto findings
(strangely enough as support for her contention
of a relationship between
discussing this issue. Truly, it is impressive that
such a large percentage of original nonresponders 
should respond to a first follow-up contact, but in 
light of the previously mentioned considerations, 
we should at best draw guarded conclusions from 
the Stanton study. 

Norton (1930) studied response rate and used 
“items of information asked” instead of page 
length as a measure of questionnaire length. In 
several instances, his results showed that longer 
questionnaires had a higher response rate than 
shorter ones. However, because of Norton’s small 
sample size, we must exercise caution in drawing 
conclusions. Furthermore, by measuring ques- 
tionnaire length as merely items of information 
asked, several long questions might have appeared 
more tedious to his subjects than many shorter 
questions, thereby affecting his results. 

In a more recent study by Champion and 
Sear (1969), questionnaires of three, six, and nine 
pages were used. Significantly more nine-page 
questionnaires were returned than three-page 
questionnaires. From this they inferred, “It is 
evident that the issue of questionnaire length is 
more complex than is realized [p. 338].” A
criticism of this study is that spacing and page 
format determined page length. Different subjects 
might have “seen” different lengths than others; 
consequently, this issue clouds the results of the 
study. 

The present study has tried to eliminate the 
methodological difficulties of previous studies in 
order to ascertain whether or not length of ques- 
tionnaire is related to response rate. Because the 
sample consisted of university professors, the 
subjects were chosen in such a way as to allow 
testing of the following two auxiliary hypotheses: 
(a) There is no relationship between status of 
professor (full, associate, or assistant) and the 
length of questionnaire to which they respond. 
(b) There is no relationship between the college 
of the professor (College of Liberal Arts, Insti- 
tute of Technology, and College of Agriculture) 
and the length of questionnaire to which they 
respond. These hypotheses were tested so that if 
a relationship between questionnaire length and 
response rate were found, it would be possible 
to know whether it was spurious. The study was 
not designed in the manner suggested by Robin 
(1965) to maximize return, lest a high response 
rate obscure otherwise significant differences. 


METHOD 
Sample 


A stratified random sampling technique was 
used, based on an examination of the University 
of Minnesota staff directory to locate four de-
partments within each of the three colleges that 
had at least three professors, three associate 
professors, and three assistant professors. By use 
of a random number table, 108 subjects were 
chosen so that there were three professors of 
each rank from each department. This resulted 
in 36 subjects from each of the three colleges, 36 
subjects from each of the three levels of pro- 
fessorship, and 9 subjects from each of the 12 
departments. It was necessary to stratify the sam- 
ple in this manner in order to allow testing of the 
auxiliary hypotheses. 


Questionnaire Design 


The questionnaires consisted of 40 questions 
dealing with current social problems. The ques- 
tions were all one or two lines long and phrased 
such that an answer of yes, no, or undecided 
would be appropriate. The questions were assigned 
randomly to four different pages (10 questions per 
page). Each page was headed: 

College Department. Position. Age. 

and had no page number, so that each of the four 
pages would appear normal if presented alone. 
The questionnaire pages were assembled into 
three lengths (one page, two pages, and four 
pages) of which there were 36 one-page questionnaires
(9 one-page questionnaires for each of the
four possible pages), 36 two-page questionnaires
(6 two-page questionnaires for each of the
six combinations of the four pages of questions), 
and 36 four-page questionnaires (each including 
all four pages). This method of questionnaire 
construction eliminated the contaminating factor 
of question interest apparent in the Sletto (1940) 
study. Furthermore, this determination of ques- 
tionnaire length is free from the criticisms of the 
Norton (1930) and the Champion and Sear 
(1969) studies. The questionnaires were ran- 
domly assigned to subjects so that each level of 
professorship in each department received one 
one-page questionnaire, one two-page question- 
naire, and one four-page questionnaire. This 
method of questionnaire distribution should have 
avoided possible bias due to differences between 
the personnel of various departments or to differences
between the levels of professorship.
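The assembly and distribution scheme described above can be checked with a short Python sketch. The page labels, function names, and random seed are hypothetical; the counts (9 copies of each single page, 6 copies of each of the six page pairs, 36 four-page sets, and one questionnaire of each length per department-by-rank cell) follow the description in the text.

```python
import random
from collections import Counter
from itertools import combinations

PAGES = ("P1", "P2", "P3", "P4")  # hypothetical labels for the four pages


def build_questionnaires():
    """Return the 108 questionnaires as (length, pages) tuples."""
    one_page = [(1, (page,)) for page in PAGES for _ in range(9)]
    two_page = [(2, pair) for pair in combinations(PAGES, 2) for _ in range(6)]
    four_page = [(4, PAGES) for _ in range(36)]
    return one_page + two_page + four_page


def assign_lengths(n_departments=12,
                   ranks=("full", "associate", "assistant"), seed=0):
    """Give the three professors in each department-by-rank cell
    one questionnaire of each length, in random order."""
    rng = random.Random(seed)
    assignment = {}
    for dept in range(n_departments):
        for rank in ranks:
            lengths = [1, 2, 4]
            rng.shuffle(lengths)
            for professor, length in enumerate(lengths):
                assignment[(dept, rank, professor)] = length
    return assignment


questionnaires = build_questionnaires()
length_counts = Counter(length for length, _ in questionnaires)
pair_counts = Counter(pages for length, pages in questionnaires if length == 2)
assignment = assign_lengths()
assigned_counts = Counter(assignment.values())
```

Because every page and every page pair appears equally often, differences in how interesting particular pages are cannot be confounded with questionnaire length.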


Questionnaire Distribution 


The 108 questionnaires were distributed simul- 
taneously through the campus mail. Sent with 
each questionnaire was a self-addressed return 
envelope and a letter requesting participation in 
a research project. The subjects, therefore, had 
no information regarding the true purpose of the
study. The cutoff date
which time, according t
questionnaires shoul


TABLE 1

NUMBER OF QUESTIONNAIRES RETURNED
AND NOT RETURNED

Questionnaire page length | No. returned | No. not returned
1 | 23 (64%) | 13 (36%)
2 | 20 (56%) | 16 (44%)
4 | 15 (42%) | 21 (58%)

Note. χ² = 3.65 (not significant at .05 level); N = 108.


RESULTS
Questionnaire Length and Response Rate

Of the 108 questionnaires mailed, 58 (53.7%)
Table 1 show that
there appears to be a negative correlation in the
table between length and response rate, this
relationship is not statistically significant.
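The chi-square value reported with Table 1 can be recomputed from the returned and not-returned counts. The Python sketch below is illustrative only: the function name is hypothetical, and 5.991 is the standard tabled critical value of chi-square for 2 degrees of freedom at the .05 level.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Rows: one-, two-, and four-page questionnaires.
# Columns: number returned, number not returned (from Table 1).
observed = [[23, 13], [20, 16], [15, 21]]
statistic, df = chi_square(observed)
CRITICAL_05_DF2 = 5.991
significant = statistic > CRITICAL_05_DF2
```

The statistic rounds to 3.65 on 2 degrees of freedom, below the critical value, which agrees with the reported lack of significance.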

of the subject was not related at a statistically
significant level to the length of questionnaire to
which he responded.


CONCLUSIONS 


This study failed to show a significant correlation
between questionnaire length and response
rate. Also, the two auxiliary hypotheses, which
were basically checks of the sampling,
failed to show significant correlations. The failure
to find a significant relationship between length
and response rate further emphasizes the contention
of this paper, that investigators have been
too hasty in accepting the assumption of a negative
correlation between questionnaire length and
response rate. Moreover, this study has been
free from some of the methodological problems
that have clouded the findings of earlier studies.
Because questionnaire length continues to be
important in present-day research, continued
research is called for.
PERCEPTION OF ABILITIES AS A DETERMINANT OF PERFORMANCE


A. P. O'REILLY 1


Agricultural Institute, Dublin, Ireland 


Data were collected on performance effectiveness and perceived possession of 
required abilities among white-collar workers in a public organization. Support 
was found for the hypothesis that high performers perceive themselves as hav- 
ing more or higher levels of job-required abilities than do poor performers. 


It has been suggested (Lawler, 1969) that a 
person will be motivated to perform well when he 
perceives his job as requiring him to use abilities 
that he values. This suggestion finds support 
from Kaufmann’s (1962) findings that level of 
performance could be increased by instructions 
to the effect that the task required abilities which 
the subjects thought they possessed and from 
the study described by Aronson and Carlsmith 
(1962) that indicated that people strive to per- 
form at a level which is consistent with their 
conceptions of their abilities. The study reported 
here attempts to investigate whether perception 
of required abilities is a determinant of perfor- 
mance. 


METHOD 


Data were collected on performance effective- 
ness and perceived possession of required abili- 
ties among 64 clerical workers in a public organi- 
zation. All workers carried out the same range of 
41 routine tasks and were randomly assigned to 
different supervisors. It was hypothesized that 
the more effective workers would have a higher
perception of their job-required abilities than the
less effective workers.


1 Requests for reprints should be sent to A. P.
O'Reilly, who is now at Research and Planning,
AnCO (Industrial Training Authority), Ballsbridge,
Dublin 4, Ireland.

Job performance of the subjects did not lend 
itself to objective measurement. Because the jobs 
were made up of many tasks for which skills/ 
knowledge required might be independent (Guion, 
1961; Ronan, 1963), the jobs were broken down 
into a comprehensive list of skills/knowledge 
required. Supervisors were asked to rate each of 
their subordinates and each subordinate task using 
a 4-point scale (Figure 1) on (a) subordinate's 
present skills/knowledge level for each task and 
(b) skills/knowledge required to perform each 
task effectively. A subordinate was regarded as 
not performing a task effectively when rating 
a < rating b. The subordinate's perceived possession
of required abilities was examined by ask-
ing each subordinate to rate (a) his present 
skills/knowledge level for each task and (b) the 
skills/knowledge level required for effective per- 
formance of each task, using the same 4-point 
scale. A subordinate was regarded as perceiving 
that he possessed the required skills/knowledge 
for any task where his rating a > rating b for that 
task. 

RESULTS 


Subordinates were ranked in order of effective- 
ness, the effectiveness measure being the number 


Column A (Position) | Code | Column B (Employee)
Does not require employee to perform this task | 0 | Has no understanding of this task
Requires only a general understanding of this task, without a requirement to perform | 1 | Has a general understanding of this task, but cannot perform
Requires performance of this task under direct supervision | 2 | Can perform adequately under direct supervision
perform fully w
or can guide less skilled personnel
TABLE 1

SUPERVISORS’ PERCEPTIONS OF SUBORDINATES’
ABILITIES FOR EFFECTIVE TASK PERFORMANCE

Item | Effective performer | Noneffective performer
No. tasks for which skills/knowledge level is as required for effective performance | 315 | 152
No. tasks for which skills/knowledge level is higher than required | 60 | 9
Total | 375 | 161
M | 22.06 | 13.42

Note. n = 17 for effective performer and 12 for noneffective
performer. Mann-Whitney U test = 26, p < .001, one-tailed.


of tasks for which the supervisor rated the subordinate
as having a level of skills/knowledge
equal to or greater than that required for effective
performance. Only subordinates who received
such ratings for at least half (21) the
total number of tasks were regarded as effective
for the purposes of this study. There were 17
such subordinates, for whom the mean number of
tasks for which they were assigned an effective
rating was 22.06. Noneffective performers were
regarded as those who received effective ratings
for a significantly smaller (p < .001) number of
tasks. There were 12 such subordinates, for
whom the mean number of tasks for which they
were assigned an effective rating was 13.42 (Table 1).
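The effectiveness measure can be sketched in Python as a count of tasks on which the present-level rating is at least the required-level rating, with the cutoff of 21 of the 41 tasks. The function names and the example ratings are hypothetical. The sketch covers only the rule for effective performers, not the separate criterion used to identify the noneffective group.

```python
N_TASKS = 41
CUTOFF = 21  # "at least half (21)" of the 41 tasks


def effective_task_count(present, required):
    """Number of tasks where the present rating meets or exceeds the required rating."""
    return sum(1 for a, b in zip(present, required) if a >= b)


def is_effective_performer(present, required, cutoff=CUTOFF):
    """True when enough tasks receive an effective rating."""
    return effective_task_count(present, required) >= cutoff


# Hypothetical subordinate rated 2 on every task (codes run from 0 to 3).
present = [2] * N_TASKS
required_easy = [2] * 21 + [3] * 20   # meets the requirement on 21 tasks
required_hard = [2] * 20 + [3] * 21   # meets the requirement on 20 tasks

# Group means implied by the totals in Table 1.
mean_effective = 375 / 17
mean_noneffective = 161 / 12
```

The two group means follow directly from the task totals and group sizes reported in Table 1.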


TABLE 2

SUBORDINATES’ PERCEPTIONS OF THEIR POSSESSION
OF REQUIRED ABILITIES

Item | Effective performer | Noneffective performer
No. tasks for which skills/knowledge level is as required for effective performance | 274 | 165
No. tasks for which skills/knowledge level is higher than required | 147 | 22
Total | 421 | 187
M | 24.77 | 15.59

Note. Mann-Whitney U test = 39, p < .001, one-tailed.


Perceived possession of required abilities was 
compared between effective and noneffective per- 
formers, and the results are shown in Table 2. 
Effective workers perceived themselves as having 
abilities at, or higher than, the level required for 
effective performance for a mean number of 
24.77 tasks, compared to a figure of 15.59 tasks 
for noneffective workers (p < .001). 


CONCLUSIONS 


The findings seem to support the hypothesis 
that high performers perceive themselves as hav- 
ing more or higher levels of job-required abilities 
than do poor performers. In the light of the ex- 
pectancy theory advanced by Georgopoulos, Ma- 
honey, and Jones (1957), Vroom (1964), and 
others, these findings might suggest a reinterpre- 
tation of some earlier studies in which correla- 
tions were found between a person’s job satisfac- 
tion and the extent to which he viewed the job 
as requiring his abilities (Brophy, 1959; Korn- 
hauser, 1964; Vroom, 1962). 
EFFECT OF WORK SAMPLE TEST UPON SELF-SELECTION
AND TURNOVER OF JOB APPLICANTS 1


JAMES L. FARR, BRIAN S. O'LEARY, AND C. J. BARTLETT


University of Maryland and 


American Institutes for Research, Washington, D.C. 


The hypotheses that job applicants who were administered a preemployment 
work sample test and who, consequently, had a more accurate expectancy about 
task requirements would have a higher job refusal rate and a lower voluntary 
turnover rate than applicants not administered the work sample test were 
examined with a sample of sewing machine operator applicants. Some support 
for the hypotheses was found for white subjects but not for blacks. Racial
differences were explained in terms of the differential importance of factors in
the work situation.


Several studies have found that preemployment 
expectations about the nature of a job which are 
not confirmed by job experience are related to 
subsequent job turnover (Katzell, 1968; Weitz, 
1956; Weitz & Nuckols, 1955). Only in Weitz 
(1956) was the accuracy of preemployment work
expectancies experimentally varied. Weitz pre- 
sented one group of applicants for the job of life 
insurance agent a detailed booklet describing the 
job activities of an agent and the average amount 
of time spent in each activity. A matched control 
group received no such booklet. The termination 
rate for the control group was significantly higher 
than for the experimental group. 

The purpose of the present study was to 
examine the usefulness of a work sample test in 
providing applicants with accurate preemploy- 
ment information about a job. The relationship 
of preemployment information and subsequent
work expectations to job behaviors was examined
by a comparison of the voluntary, short-term 
turnover rates of applicants who were admin- 
istered the work sample test and those who were 
not. The hypothesis was that the group admin- 
istered the work sample test would have a lower 
short-term, voluntary turnover rate. A corollary
of this hypothesis was that the group exposed to
the work sample test would also have a higher
rate of job refusal than the group not so exposed.


METHOD
Subjects 


The subjects were 160 female, inexperienced 
applicants for the job of sewing machine operator 
at a ladies’ apparel factory in the mid-Atlantic 
region of the United States. Included in the 
sample were 67 white and 93 black applicants. 
The white and black samples did not significantly 
differ in age or educational level. All job appli- 
cants for a six-month interval were included in 
the study. 


Procedure 


Applicants were randomly assigned to one of 
three groups, 40 each to Groups A and B and 80 
to Group C. Group A was not administered any 
tests prior to employment. Applicants in Group B 
were administered two locally developed appa- 
ratus tests (a pinboard and a formboard). Group 
B was included in the experimental design in 
order to assess the effect of preemployment test- 
ing, per se, upon job acceptance and turnover. 
Group C applicants were administered the two 
apparatus tests and a work sample test prior to 
employment. The work sample test required 
about two hours to complete and was composed 
of items which required the applicant to handle 
pieces of fabric and to thread and actually op- 
erate the sewing machine. It should be noted that 
preemployment testing times were not equated 
for Groups B and C. 

All applicants in the three groups were offered 
employment, regardless of scores on any test. 
Since the subjects were assigned randomly to the 
groups, slightly differing proportions of white and 
black applicants comprised the three groups. 
Group A was composed of 16 white and 24 black 
workers; Group B had 15 white and 25 black 
workers; and Group C contained 36 white and 44 
black workers. All applicants who accepted employment
were placed into the sewing machine
operator training program. All data analyses 
examined each racial group separately. 


Criterion Measure 


The turnover criterion measure was developed 
by classifying all subjects in the experimental 
groups into one of five categories: remaining on 
the job; terminated due to lack of progress; 
voluntary turnover; involuntary quit; and refused
employment. The involuntary quit category in- 
cluded those workers who had to quit for such 
reasons as family’s moving from the area, sick- 
ness in the family, etc. The voluntary turnover 
category was composed of workers who had vol- 
untarily quit or had been terminated for absen- 
teeism. Workers were terminated for absenteeism 
by the company only when it was apparent that 
they did not intend to return to the job. Thus, 
these workers had also voluntarily withdrawn 
from the organization. 

Voluntary turnover was measured at three 
time intervals: two, four, and six weeks after 
employment. These time periods were used since
the purpose of the study was to look at short-term
turnover, and multiple-criterion measurements
were desired.


RESULTS 

Racial subgroup differences were noted in the 
data. The proportion of black applicants refusing 
the offered employment did not differ substantially
among the experimental groups. In Group
A, 5 of the 24 black applicants (21%) refused 
employment; in Group B, 4 of the 25 black appli- 
cants (16%); and in Group C, 11 of the 44 black 
applicants (25%). The data for the white appli- 


were more consistent 
= 33.3% for Group B, 8.4% for Group C at four
weeks; 40.0% for Group B, 11.1% for Group C 
at six weeks). Differences between Groups A and 
C were in the predicted direction for four and six 
weeks and approached significance (voluntary 
turnover = 25.0% for Group A at four weeks, 
p < .14; 31.2% for Group A at six weeks,
p < .11). No significant differences were observed
among voluntary turnover rates for the groups 
at the two-week period. No support for the 
hypothesis was found with the black subgroup. 
Voluntary turnover rates were essentially equal 
for all experimental groups (e.g., at six weeks
voluntary turnover = 20.9% for Group A, 16.0% 
for Group B, and 20.5% for Group C). All vol- 
untary turnover rates were computed on the basis 
of the number of applicants offered employment. 
The same pattern of results occurred if the num- 
ber of applicants accepting employment was used 
as the base for the computation of the rates. 
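
The refusal percentages reported above for the black applicants can be checked with a few lines of arithmetic. The sketch below uses only the counts given in the text; the helper name `pct` and the dictionary layout are illustrative assumptions. It also shows how the base shrinks when a rate is computed on applicants accepting employment rather than applicants offered employment.

```python
# Counts for black applicants as reported in the text.
offered = {"A": 24, "B": 25, "C": 44}
refused = {"A": 5, "B": 4, "C": 11}


def pct(count, base):
    """Percentage of `base` represented by `count` (hypothetical helper)."""
    return 100.0 * count / base


# Refusal rate per group, rounded to whole percentages as in the text.
refusal_pct = {g: round(pct(refused[g], offered[g])) for g in offered}

# Alternative base for turnover rates: applicants who accepted employment.
accepted = {g: offered[g] - refused[g] for g in offered}

for g in sorted(offered):
    print(f"Group {g}: {refusal_pct[g]}% refused; {accepted[g]} accepted")
```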


Discussion 


Racial differences with regard to support of the 
hypotheses were not anticipated. A purely post 
hoc explanation of these differences which may 
be offered is related to the factors affecting the 
motivation and job satisfaction of two subgroups. 
The white applicants and workers may have 
viewed the nature of the work and other related 
task variables as being the most important factors 
in the work situation. Thus, the opportunity to 
experience a realistic work simulation before 
employment may have provided accurate expecta- 
tions about important aspects of the job for the 
white applicants. The white applicants in those
groups that were not administered the work
sample test would also have preemployment
expectations about the job, but those expectations
would probably be less accurate. Disconfirmed
expectations would result in a tendency toward
voluntary turnover.

Black job applicants, on the other hand, may 
not have placed as much emphasis on task-related 
factors. Perhaps such factors in the work situation
as inter
and co-worker 


e voluntary turnover rate 
among the black experimental 
information about these 
situation was provided by


data presented by Bloom and Barry
found that extrinsic job factors were
nt than intrinsic factors in a sample
ative importance of extrinsic factors was not
found with a white blue-collar sample. The expla- 
nation that is offered to account for the data of 
the present study is equivocal but appears to be 
reasonable. 

Other findings of the present study deserve 
some consideration. The effects of testing, per se, 
had little effect upon job acceptance or turnover 
behaviors. However, testing time was not equated 
for Groups B and C. An effect might have been
found if both groups had been administered tests 
for the same length of time. 

No significant differences in turnover rates 
were observed among the experimental groups for 
the two-week period for the white subgroup. The 
nature of the training program may have caused 
this lack of significant results for the two-week 
period. Initially, each trainee learns very basic 
sewing skills, such as needle threading, cloth 
handling, etc., which do not involve actual sewing 
activities. After learning the basic skills, the 
trainee practices the actual sewing operation that 
she will later perform on the job. The initial 
basic skills portion of the training program may 
have been accurately judged by the workers as 
not indicative of the job situation, whereas the 
sewing operation practice was quite similar to the 
actual job situation. Thus, interest in and ability
to perform the basic skills may not have had as
much effect upon worker turnover as the later
sewing operation practice.

The data presented here offer some support 
for the efficacy of a personnel selection approach 
that incorporates a method of presenting accurate 
information about the job to the applicant. 
Future research should be directed toward the 
understanding of the relationship between pre- 
employment variables, such as work expectancies, 
interest, and abilities, and postemployment vari- 
ables, such as motivation, satisfaction, and 
performance. 

WORK OBSERVATION VERSUS RECALL IN DEVELOPING
BEHAVIORAL EXAMPLES FOR RATING SCALES 


JAMES E. CAMPION, JACK GREENER, AND SAM WERNLI


University of Houston 


The retranslation method was used with supervisors from a national airline 
to develop rating forms. One group of supervisors employed the work 
observation method and another group used the recall method in collecting 
behavioral examples. The results indicated that the difference in method did 
not influence supervisor agreement in reclassifying behavioral examples with 
regard to (a) dimension of job performance illustrated or (b) level of 
performance illustrated. 


Smith and Kendall (1963) proposed the 
retranslation method as a new approach to 
the construction of rating scales. Their method 
has been recommended as one solution to the 
persistent problem of developing rating forms 
that are psychometrically sound and yet are 
accepted and accuratel 
untrained raters (Cam 
Weick, 1970; Dunnet 
method of collecting behavioral examples for use
in the retranslation approach to developing
behavioral rating forms.


METHOD 
Sample Characteristics 


The job described was that of a customer 
service agent² working for a national airline in
their Denver and Houston terminals. Descriptions 
were collected from 32 supervisors of whom 19 
were located in Denver and 13 in Houston. 


Procedure 


With one exception, the procedures used were
the same for Denver and Houston samples and
were essentially the same as those employed by
Smith and Kendall (1963). The exception was
the method employed in collecting the behavioral
examples. The Houston group used work observation,
whereas the Denver group used recall.
Briefly summarized, the procedure involved the
following steps:

1. Every manager responsible for supervising
customer service agents participated in a conference
where the need for improving their performance
appraisal system was discussed and the
steps necessary for improving the system were
outlined. Next they were asked to describe the
qualities which when present to a high degree
describe an above-average agent and when present
to a low degree describe a below-average agent.
[...] work behavior of customer service agents
[...] defined as a result of this [...]:
accuracy-productivity, dependability,
[...] personal appearance, relations with
others, and safety.

² Department of Transportation, transportation
agent 912.368.

2. A second conference was conducted with
each supervisor. During this session, after being
given feedback on the results of the first conference,
participants were asked to provide three
behavioral illustrations for each work dimension 
describing above-average performance, average 
performance, and below-average performance. 

All instructions to the Denver and Houston 
groups were identical except for those describing 
methods to be used for collecting behavioral 
examples. Here the Houston group was asked to 
use a work observation method of collecting 
behavioral examples. They were provided with 
forms to facilitate the recording task. These 
forms were constructed so that they could be 
conveniently carried in their shirt pocket. Par- 
ticipants were told to take about two weeks to 
collect all the examples, but they were allotted 
more time if they felt it necessary. 

The Denver group was asked to recall be- 
havioral examples to illustrate each work dimen- 
sion. They were told to think back over the time 
they had supervised customer service agents and 
to describe three illustrations for each work 
dimension. They were also provided with a set of 
forms to facilitate the recording of information. 

3. In total, 432 usable behavioral examples were
obtained. Approximately 60% of these had been
generated in Denver using the recall method. 
These items were thoroughly shuffled and then 
alternately assigned into two lists (A and B). 
Each list included items that were generated by 
both the work observation and recall methods.⁴

4. In the final stage, each supervisor was given 
either List A (Denver) or List B (Houston) and 
was asked to sort each example into the appro- 
priate work dimension and to assign it a value of 
from 1 (far below average) to 6 (far above 
average) which indicated level of performance 
illustrated on that work dimension. 
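
Step 3 above (shuffle the pooled examples, then deal them alternately into two lists) can be sketched as follows. This is a minimal illustration, not the authors' procedure in detail: the function name `split_alternately`, the fixed seed, and the placeholder item labels are assumptions introduced here.

```python
import random


def split_alternately(items, seed=0):
    """Shuffle the items, then deal them alternately into two lists (A and B)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded only so the sketch is repeatable
    return shuffled[0::2], shuffled[1::2]


# 432 usable behavioral examples, as reported; the labels are placeholders.
examples = [f"example_{i}" for i in range(432)]
list_a, list_b = split_alternately(examples)
print(len(list_a), len(list_b))
```

Because assignment ignores the method that produced each item, both lists end up mixing work observation and recall items, which is the property the design relies on.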


RESULTS 


The returns were analyzed to answer two
related questions. The first question concerned
the relative effectiveness of recall versus work
observation as methods of obtaining behavioral
examples. A second question concerned the generalizability
of behavioral examples generated by
these methods to a group of raters who had not
been involved in generating them. The analysis
employed a two-way analysis of variance with
repeated measures on one factor. The levels of
one factor, the within-subjects factor, represent 
the two methods of generating behavioral 
examples (work observation vs. recall). The 


⁴ The item distribution was as follows: List A,
35% work observation, 65% recall; List B, 45%
work observation, 55% recall.
levels of the second factor, the between-subjects
factor, represent the two rater groups (Houston 
vs. Denver supervisors) who reclassified the be- 
havioral examples. Under these conditions, the 
within-subjects factor provided information on 
the relative effectiveness of the two methods of 
generating behavioral examples, while the inter- 
action term provided information on the gen- 
eralizability of the methods across different rater 
(supervisory) groups. The procedures followed 
were those described by Winer (1962) using the 
unweighted means solution for unequal Ns. 

Two aspects of the returns were analyzed in 
this manner. The first was raters’ agreement
regarding which dimension was illustrated by each 
behavioral example. The proportion of behavioral 
examples correctly reclassified by each individual 
was the dependent variable. A reclassification was 
judged correct if it corresponded to the modal 
response of those raters who reclassified the 
behavioral example. The analysis of variance for 
these data indicated no significant effects asso- 
ciated with the method used in generating be- 
havioral examples (F = .10, df = 1/30, p > 
.05). In addition, the interaction term was not 
significant (F = 1.76, df = 1/30, p > .05). The 
proportion of behavioral examples correctly re- 
classified by the Denver supervisors was .79 for 
behavioral examples based on work observation 
and .78 for those based on recall. The proportions 
for the Houston group were .74 and .76, respec- 
tively. These proportions indicate that approx- 
imately three fourths of all behavioral examples 
were reliably reclassified. 
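
The agreement measure described above (a reclassification counts as correct when it matches the modal response of the raters) can be sketched as follows. The data structure, function names, and the tiny data set are hypothetical; ties for the mode are broken by first occurrence here, which is an assumption the text does not address.

```python
from collections import Counter


def modal_dimension(assignments):
    """Most frequent dimension among the raters' assignments for one example."""
    return Counter(assignments).most_common(1)[0][0]


def proportion_correct(ratings):
    """ratings maps rater -> {example: dimension}. Returns rater -> proportion
    of that rater's examples whose dimension matches the modal response."""
    examples = {e for r in ratings.values() for e in r}
    modal = {
        e: modal_dimension([r[e] for r in ratings.values() if e in r])
        for e in examples
    }
    return {
        rater: sum(dim == modal[e] for e, dim in r.items()) / len(r)
        for rater, r in ratings.items()
    }


# Hypothetical data: three raters sorting four behavioral examples.
ratings = {
    "r1": {"e1": "safety", "e2": "dependability", "e3": "safety", "e4": "appearance"},
    "r2": {"e1": "safety", "e2": "dependability", "e3": "dependability", "e4": "appearance"},
    "r3": {"e1": "safety", "e2": "safety", "e3": "dependability", "e4": "appearance"},
}
scores = proportion_correct(ratings)
print(scores)
```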

A second aspect of the data analyzed was the 
agreement among supervisors regarding level of 
performance illustrated by each behavioral 
example. For this analysis, only those behavioral 
examples which had been correctly reclassified 
were used. For each individual, a deviation score 
was computed for the behavioral examples he 
correctly reclassified. Each deviation score was 
based on the difference between the group mean 
for the behavioral example and the value assigned 
by the individual. An individual's average devi- 
ation score was used as the dependent variable. 
Again the analysis of variance indicated no sig- 
nificant effects associated with whether the recall 
or work observation method of generating be- 
havioral examples was used (F = 2.97, df = 
1/30, p > .05). Furthermore, the interaction was 
not significant (F = .51, df = 1/30, p > .05). 
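
A minimal sketch of the deviation score described above follows. Two points are assumptions rather than statements from the text: the deviation is taken as an absolute difference, and the group mean for an example is computed over the raters who correctly reclassified it. All names and values are hypothetical.

```python
from statistics import mean


def average_deviations(levels):
    """levels maps rater -> {example: assigned level}, restricted to the
    examples that rater correctly reclassified. Returns rater -> mean
    absolute deviation from the group mean level of each example."""
    by_example = {}
    for rated in levels.values():
        for example, value in rated.items():
            by_example.setdefault(example, []).append(value)
    group_mean = {e: mean(vals) for e, vals in by_example.items()}
    return {
        rater: mean(abs(v - group_mean[e]) for e, v in rated.items())
        for rater, rated in levels.items()
    }


# Hypothetical ratings on the 1 (far below average) to 6 (far above average) scale.
levels = {
    "r1": {"e1": 5, "e2": 2},
    "r2": {"e1": 6, "e2": 2},
    "r3": {"e1": 4},
}
deviations = average_deviations(levels)
print(deviations)
```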

5 The analysis was based on an arcsin transfor- 
mation of the data which Winer (1962) suggests may 
be a more appropriate scale when using data based 
on proportions. 
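
For readers unfamiliar with the transformation named in the footnote, a common form of the arcsine transformation for a proportion p is 2·arcsin(√p). The sketch below assumes that form; the text does not state which variant was applied, and the function name is illustrative.

```python
import math


def arcsine_transform(p):
    """Variance-stabilizing transform for a proportion p in [0, 1].
    Assumes the 2*arcsin(sqrt(p)) form, which the text does not specify."""
    return 2.0 * math.asin(math.sqrt(p))


# Proportions correctly reclassified, as reported in the text.
for p in (0.79, 0.78, 0.74, 0.76):
    print(p, arcsine_transform(p))
```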
Being unable to reject the null hypothesis in this
case would seem to suggest that (a) the work
observation and recall methods were equally
effective in providing reliable behavioral examples
and (b) the items were perceived with equal
precision whether they were generated in the
supervisors’ own work area or by a supervisory
group from another geographical location.

The average deviation for the Denver super- 
visors was .55 for behavioral ex: 
work observation and 
recall. The average devi 
Supervisors were .47 an 
differences observed be 
Houston groups 


(F = 4.71, df = 1/30, p < 03), indic.
the Denver super 


significant 
ating that 


received different lists of be 
Consequently, it i 
in reclassification variability may be due to list
differences. However, the behavioral examples


had been randomly assigned to lists 
fore, this interpretatio, 




The main conclusion to be drawn from these
findings is that recall and work observation appear
to be of equal value as methods of generating
behavioral examples, both in terms of their
reliability and their generalizability across rater
groups.

RESPONSE TIME IN THE FULL VISUAL FIELD *


RICHARD F. HAINES
Neurosciences Branch, Ames Research Center, National Aeronautics and
Space Administration, Moffett Field, California

KIRBY GILLILAND
California State University, San Jose


Seven male volunteers were administered a binocular peripheral visual response 
time task to determine response time to small (45’ arc), white, flashed, pho- 
topic stimuli. These stimuli were located 10° arc apart from 10° arc to 90° 
arc from the fovea along each of eight retinal meridians, each 45° arc apart
around the 360°. Testing occurred approximately every fourth day throughout 
a three-month-long bedrest investigation. The results showed that the retina 
possesses relatively concentric regions almost twice as wide as high within each 
of which mean response time can be expected to be of equal duration. These 
findings are related to previous response time research. Two examples are given 
of how these data may be applied to the design of an aircraft instrument
panel and cockpit window.


An important criterion in the design of 
instrumental panels is the proper placement of 
emergency warning indicators and controls so 
as to minimize response time. The
large number of investigations on visual re- 
sponse time which have been reported pro- 
vide useful information in this regard. Never- 
theless, most previous research has been 
confined to stimuli imaged along the horizontal
retinal meridian² (cf. Rains, 1963; Zahn
& Haines, 1971, for reviews of this subject). 

Several investigators have quantified re- 
sponse time to stimuli imaged along meridians 
other than (and including) the horizontal. In 
Kobrick's (1965) earliest response time study, 
he developed a response time perimeter that 
presented 32 stimulus positions. Four angular
separations from the line of sight (hereafter
called θ) were investigated: 12°, 38°, 64°,
and 90° arc along each of the following eight
meridians: 60°, 90°, 120°, 170°, 240°, 270°,
300°, and 350° arc. Hereafter, the symbol φ
will be used to designate the frontal plane
(i.e., normal to the line of sight) meridianal
angle as measured from the vertical (0°) and
progressing in the clockwise direction. His
findings are of particular relevance to the 


1 Requests for reprints should be sent to Richard 
Haines, Neurosciences Branch, Ames Research 
Center, National Aeronautics and Space Administration,
Moffett Field, California 94035.
² A retinal meridian is the projection of a straight
line in the frontal plane that passes through the
center of the fovea.


present investigation because he presented 
them in the form of a polar coordinate plot, 
the center of which represented the foveal 
fixation point. A horizontally oriented, ap- 
proximately symmetrical region was presented 
which was the area of “unaffected intentional 
response times.” It extended from 90° left
≤ θ ≤ 90° right horizontally, about 38° arc
above and 64° arc below fixation. This boundary
indicated that stimuli imaged inside it
produced response times that were not signifi- 
cantly different from each other, while stim- 
uli imaged outside it produced response times 
that were significantly longer. In later studies 
(Kobrick, 1971, 1972; Kobrick & Appleton, 
1971; Kobrick & Dusek, 1970), the number 
of φ values was increased from 8 to 12 in 30°
arc increments while the same θ positions were
investigated as listed previously. In a study by 
Haines (1968), binocular detection time for 
27 stimulus positions was quantified. The sub- 
jects were run in nine separate groups, each 
of which was presented only 3 stimulus posi- 
tions. All stimuli were imaged within a circle 
of about 45° arc radius. Mean response time
was found to be minimal (492 milliseconds) 
for stimuli imaged within a ring which ex- 
tended from 10° to 20° arc radii from fixa- 
tion. Payne (1966) investigated response 
time for 24 stimulus positions (from 30° arc 
on one side to 30° arc on the other side of
the fovea) along each of eight meridians
(φ = 80°, 90°, 100°, 135°, 260°, 270°, 280°,
and 315°). In another investigation, Payne
(1967) held θ constant at 15° arc, imaging
the stimuli in a circle around the fovea in
31 locations, each 10° arc apart. Mean response
time was found to be maximum (268
milliseconds) at φ = 210° and minimal (244
milliseconds) at φ = 90°.

It is apparent from this review that rela- 
tively large retinal areas still have not been 
tested for their ability to mediate response 
time. This is particularly unfortunate since 
various engineering designers require response 
time data for the full visual field. Such data 
could also be of value in obtaining a better 
understanding of the relationship between pe- 
ripheral visual response time and other known 
response characteristics obtained over the 
retina. Therefore, the primary objective of the 
present investigation was to compare response 
time at many retinal positions along many 
meridians to develop a graphic representation 
of response time in the full visual field. 


METHOD 
Design and Procedure 
This investigation cai 
the number of entrances into the subject's [...] to
reposition the apparatus. A 5-minute break [...]
between the presentation of the first two [...]
two meridians. Two meridians required [...]
minutes to administer. All stimulus positions
were presented randomly.

To keep the subjects from learning when the stimulus
was going to appear, the interstimulus interval
was varied randomly from 1.8 to 4 seconds in [...]
steps (M = 2.4 seconds). Use of these intervals
yielded 25 trials per minute.

The subject responded by pressing a button with
the thumb of his right hand. If he did not respond,
the interstimulus interval for that trial was recorded
and labeled as a no response. A no response as well
as responses that occurred before the stimulus appeared
(anticipation responses) were automatically
excluded from these analyses.
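
The reported trial rate follows directly from the mean interstimulus interval. The short check below uses only the figures given in the text; the function name is illustrative, and the calculation ignores the brief stimulus duration, which is an assumption of this sketch.

```python
def trials_per_minute(mean_interval_s):
    """Trials presented per minute for a given mean interstimulus interval (seconds)."""
    return 60.0 / mean_interval_s


rate = trials_per_minute(2.4)  # mean interval reported in the text
print(round(rate))
```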

Each subject wore red (Kodak No. 29 [...])
goggles for at least 5 minutes before beginning a [...]-minute
adaptation to total darkness. An alert [...]
was sounded just before testing began. The subject
remained in the supine position for all response time
testing and kept both eyes open.


Apparatus 


The apparatus has been described in detail elsewhere
(Haines, 1973). Briefly, a research response
time perimeter was used which presented 18 stimuli
across the subject’s field of view from 90° arc on one
extreme to 90° arc on the other in 10° arc increments.
When positioned to test the vertical and two
of the four 45° arc oblique meridians, the perimeter’s
far peripheral stimuli were imaged outside the subject’s
field of view. In order to eliminate “false positive”
response time data obtained from [...] positions
(see footnote 4), each subject’s binocular field
of view limits were measured using the [...]
perimetric technique. A 1-millimeter diameter luminous
source was moved slowly along each [...]
toward (and away) from the fovea until the subject
said he perceived it (or no longer perceived it).
These mean field of view limits are based upon [...]
trials in each direction.


Each light cone had a diameter of about 15 centi- 
meters at the eye location which allowed the subject 
to move his head slightly during testing without 
changing the stimulus’ luminance appreciably. 

The source for all stimuli was a single, fluorescent 
flash lamp with 1-microsecond rise time. It remained 
on for 50 milliseconds during each trial. The luminance
of all stimuli except the θ = 0° fixation source
was adjusted to 3.30 X 10? candelas per centimeter 
squared (.09 footlamberts). The fixation source’s 
luminance was constant at 3.15 X 10 candelas per 
centimeter squared. 

In order to help control for circadian effects, each 
subject was tested at approximately the same time 
of day. 


Subjects 


Seven male volunteers took part. Their ages ranged 
from 19 to 22 years old (M = 20.4 years). All possessed
full and normal visual field sensitivity, 20:20
near and distance acuity or better, and none had
color defects or other visual dysfunctions that may
have impaired their responses on this test. All were
paid for their services and were highly motivated
throughout the prolonged study.


RESULTS 


The mean response time results are pre- 
sented in Figure 1. Test stimulus position (in 
degrees from the line of sight) is given on the 
abscissa, while mean response time is given on 
the ordinate. The curves shown for each 
meridian have been fit by eye. Each meridian 
is labeled by its respective φ angle as well as
by the following letter designations: T =
top, UR = upper right, R = right, LR =
lower right, B = bottom, LL = lower left,
L = left, and UL = upper left. A horizontal
reference line is provided for each set of
meridians along with a vertical response time
measurement unit which can be used to determine
mean response time at any stimulus
position. The vertical bar extending from each
data point is plus one standard deviation. The
limit of the binocular field of view for these
subjects is shown by the small vertical line
drawn through each curve. 

The individual subject’s data upon which 
Figure 1 was plotted show differences in (a) 
the range of response times across these stimu- 
lus positions, (b) the form of the curve fit 
for corresponding meridians, (c) the size of
the standard deviations at corresponding stimulus
positions and meridians, and (d) the
field of view limits on corresponding meridians.
It is apparent from Figure 1 that the size
of the standard deviation tends to increase
slightly at larger values of θ.

Figure 1. Grand mean results. (Abbreviations:
T = top, UR = upper right, R = right, LR = lower
right, B = bottom, LL = lower left, L = left, and
UL = upper left.)

The mean data were also subjected to two
linear least squares fits per meridian: 10° ≤
θ ≤ 50°, 60° ≤ θ ≤ 90°. From these curves,
the grand mean data were replotted as equal
response time regions within the visual field.
These regions are presented in Figure 2. Each
mean response time can be expected to be the 
same. The heavy dashed line shows the bi- 
nocular and the heavy solid line shows the 
outer limit of the subject's monocular field of 
view, according to Fulton (1955). Perimetry 
of our subjects’ visual fields were within +2° 
of this boundary. 
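
The two-segment fitting described above can be sketched as follows: one least squares line for the inner stimulus positions, one for the outer positions, and then the eccentricity at which a fitted line reaches a chosen response time, which gives one point on an equal response time boundary. The response times below are invented for illustration and are not the study's data; all function names are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_meridian(thetas, rts, split=50):
    """Fit one line to positions up to `split` degrees and one to the rest."""
    inner = [(t, r) for t, r in zip(thetas, rts) if t <= split]
    outer = [(t, r) for t, r in zip(thetas, rts) if t > split]
    return linear_fit(*zip(*inner)), linear_fit(*zip(*outer))


def eccentricity_at(rt, slope, intercept):
    """Eccentricity (degrees) at which the fitted line reaches response time rt."""
    return (rt - intercept) / slope


# Invented mean response times (milliseconds) along a single meridian.
thetas = [10, 20, 30, 40, 50, 60, 70, 80, 90]
rts = [260, 270, 280, 290, 300, 320, 340, 360, 380]

(inner_slope, inner_icpt), (outer_slope, outer_icpt) = fit_meridian(thetas, rts)
boundary_340 = eccentricity_at(340, outer_slope, outer_icpt)
print(inner_slope, inner_icpt, outer_slope, outer_icpt, boundary_340)
```

Repeating the last step for each meridian and joining the resulting points would trace one contour of the kind plotted in Figure 2.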

These data were also subjected to an analy- 
sis of variance. Table 1 presents these results. 
Since the session main effect was not significant
and had to do with other parameters involved
in the bedrest study which are discussed
elsewhere (Greenleaf et al., 1973), it
will not be discussed further. As indicated by
the degrees of freedom associated with the 
various sources of error variance, this analysis 
was performed only on those response time 
The present data provide a means of optimizing
the location of small [...]
emergency controls and displays anywhere
within the subject’s binocular field of view.
With regard to response time, Kennedy, [...]annsen,
and Devoe (1952) suggest that such
controls and displays should be located within
a 30° arc radius circle from the normal line
of sight. However, the present data suggest
that a different effective retinal shape is
more appropriate to mediate rapid [...]
responses. The present data have shown that
the shape is generally a horizontally oriented
oval, as shown in Figure 2. If a single master
warning indicator is to be used to alert the
operator to the onset of a malfunction [...]
he must subsequently identify and respond,
the master warning indicator should be placed
as close to the normal line of sight as possible
while at the same time insuring that high
contrast is maintained with its immediate
background.

Figure 3. Application of Figure 2 data to warning-light placement on a
representative aircraft instrument panel.

Although the reader is referred elsewhere
for a more complete treatment of the complex
subject of instrument panel layout
(Kennedy et al., 1952; Stellar, 1949;
Wulfeck, Weisz, & Raben, 1958), the following data
can be used to help optimize the location of
luminous warning indicators and emergency 
controls so that they will be likely to elicit 
the most rapid response. Figure 3 shows a 
mock-up of the pilot’s side of a representative 
aircraft instrument panel. Superimposed upon 
this illustration at the same angular scale are 
isoresponse time regions taken from Figure 2. 
The luminance level of the present peripheral
stimuli was higher than the level below which
visual performance rapidly deteriorates (Rock, 
1953). 

Since pilots scan various instruments dur- 
ing flight (Fitts, Jones, & Milton, 1962; Mil- 
ton & Wolfe, 1952; Watts & Wiltshire, 1955), 
the different isoresponse time regions will also 
shift with each new fixation. Fixation has 
been placed upon the panel’s map display for 
the present purposes of discussion. Since the 
fastest response times can be expected to occur
to stimuli imaged within a horizontally
oriented region that is almost twice as wide
as high, small warning lights should, when- 
ever possible, be placed to the left or right of 
the basic instruments rather than above or 
below them to help optimize response time to 
their onset. 


The following illustration shows how the 
present data may also be related to the design 
of aircraft cockpit windows to the extent that 
operational safety standards and structural 
requirements are met. Figure 4 shows the 
window outline (light grey) proposed for 
future high-performance jet aircraft. The iso- 
response time regions of Figure 2 have been 
superimposed upon it and have been centered 
at the reference eye position. The forward 
window lies generally within the 300 milli- 
seconds isoresponse time boundary while the 
left window's retinal image falls upon regions 
of longer response times. This analysis sug- 
gests that the vertical structural members on 
each side of the forward window should be 
located at least 30° arc to the right and left
of the line of sight to help maximize the 
amount of field of view that possesses the 
fastest response time. Other applications of 
these data are left to the reader. 

The present data should be applied to ac- 
tual design situations with the understanding 
that these response times were obtained under 
almost ideal viewing conditions. The subject 
was dark adapted and relaxed, was looking 
steadily in one direction, and was expecting 
tion position, the present response time contour
that matches his top and bottom boundaries
closest is the 340-millisecond contour
(38° arc above, 53° arc below, 78° arc left,
and 90° + arc right of the fixation position).
The group mean response time Kobrick found
within this area ranged from 300 to 400 milliseconds,
which is not in conflict with the present
findings.


SUMMARY 


Peripheral visual response time was mea- 
sured at 72 locations in the full visual field 
and was found to exhibit relatively concen- 
tric regions of the retina within which mean 
response time may be expected to be of equal 
duration. These data can be used by design 
engineers in a variety of disciplines and in 
many ways, for example, to help locate visual 
warning indicators and other controls so as 
to optimize response time. 
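
The 72 test locations follow from the layout given in the abstract: nine eccentricities (10° to 90° arc in 10° steps) on each of eight meridians spaced 45° apart. The sketch below enumerates them and converts each to plotting coordinates. It assumes that the first meridian lies at φ = 0° (top) and that φ increases clockwise from the vertical, the convention defined in the introduction; all names are illustrative.

```python
import math

ECCENTRICITIES = list(range(10, 100, 10))  # theta: 10 to 90 degrees arc
MERIDIANS = list(range(0, 360, 45))        # phi: eight meridians, 45 degrees apart

positions = [(theta, phi) for phi in MERIDIANS for theta in ECCENTRICITIES]


def to_plot_xy(theta, phi):
    """Map (theta, phi) to x (rightward) and y (upward) on a polar plot,
    using theta as radial distance and phi clockwise from the vertical."""
    rad = math.radians(phi)
    return theta * math.sin(rad), theta * math.cos(rad)


print(len(positions))
print(to_plot_xy(10, 90))
```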
Following the original work of Deutsch
(1949), a significant body of research h

Two hypotheses are tested:


1. A performance-based differential-reward
system will lead to greater individual productivity
than a performance-based equal-reward
system in a production task.

2. Individual productivity will be greater 
when the task structure involves less inter- 
dependence between group members. 


The rationale for the first hypothesis has 
already been stated implicitly. If there are 
few opportunities for blocking behavior, a 
financial incentive should increase individual 
effort to maximize reward. The second hy- 
pothesis is based on the premise that opportu- 
nity for blocking the productivity of others 
is less under low interdependence than under 
high interdependence. Also, subjects under the 
low-interdependence conditions should have 
greater opportunity to set the pace of 
their work than those under the high-inter- 
dependence conditions. 

The second purpose of this paper is the 
explication of a research paradigm which in- 
corporates both individual differences and ex- 
perimental manipulation within the same de- 
sign. The combined differential-experimental 
analysis is expected to account for signifi- 
cantly more variance in productivity than 
either the differential or experimental analy- 
sis performed singly. In addition, this design 
permits the testing of differential validity and 
utility across experimental conditions. 


METHOD 

Subjects 

Seventy-two male students taking undergraduate 
psychology courses at Carnegie-Mellon University
served as subjects. The subjects were solicited to help
code questionnaires for a research project that the
experimenter was conducting. It was stated that they
would be paid according to their performance and
that expected earnings would average $2.00 for one
hour of work. The subjects who volunteered were
randomly assigned to the experimental conditions.


Design 


The experimental design consisted of two reward 
and two task-flow-interdependence conditions. The 
reward conditions were differential reward and equal 
reward. The task-flow conditions were high- and 
low-task-flow interdependence. Six three-man groups 
were run in each cell of a 2 × 2 hierarchical design. 

A 12-item postexperimental questionnaire was devel- 
oped to test (a) subjects’ perceptions of the experi- 
mental manipulation and (b) satisfaction with their 
performance. The numerical section of the Minnesota 
Clerical Test was administered prior to each session. 

A data analytic procedure developed by Cohen 
(1968) makes it possible to analyze both individual 
differences and experimental conditions within the 
same research paradigm. This is accomplished by 
treating individual differences as continuous variables 
and experimental conditions as dummy variables. The 
F ratios obtained for the independent variables, 
including interactions, are equivalent to the F ratios 
in the analysis of variance. 
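
The equivalence described above can be illustrated with a minimal sketch. It assumes the simplest case of two conditions coded as a single 0/1 dummy variable; the scores and function names are hypothetical and are not data or procedures taken from the study.

```python
from statistics import mean


def anova_f(group_a, group_b):
    """One-way analysis of variance F ratio for two independent groups."""
    scores = group_a + group_b
    grand = mean(scores)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in (group_a, group_b))
    ss_within = sum((y - mean(g)) ** 2 for g in (group_a, group_b) for y in g)
    df_within = len(scores) - 2
    return ss_between / (ss_within / df_within)


def dummy_regression_f(group_a, group_b):
    """F ratio for the slope when scores are regressed on a 0/1 dummy code."""
    y = group_a + group_b
    x = [0] * len(group_a) + [1] * len(group_b)
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    r_squared = sxy ** 2 / (sxx * syy)
    df_resid = len(y) - 2
    return r_squared / ((1 - r_squared) / df_resid)


# Hypothetical productivity scores for two conditions (illustration only).
equal_reward = [40, 45, 50, 55, 60]
differential_reward = [50, 58, 62, 66, 74]

f_anova = anova_f(equal_reward, differential_reward)
f_regression = dummy_regression_f(equal_reward, differential_reward)
```

The two F ratios agree because the proportion of variance explained by the dummy variable equals the between-groups sum of squares divided by the total sum of squares.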


Procedure 


After the Minnesota Clerical Test was completed, 
instructions in the task procedure and the experi- 
mental manipulations were given. The task was to 
code responses to a questionnaire on standardized 
precoded forms. Each questionnaire was a set of 
three different problems with the response format 
being identical for each problem. Questionnaires were 
the same for all groups. Several practice sets were 
given to acquaint the subjects with the coding 
task. This was followed by one of two task-flow 
manipulations. 


Low-task-flow interdependence: 

Each of you will work on the stack of question- 
naires in front of you. There are fifty sets in each 
stack and you will code all three problems in 
each set. 


High-task-flow interdependence: 

Each of you will work on only one problem in 
each set. X will code the first problem and then 
pass the set to Y who will code the second prob- 
lem; then Y will pass the set to Z who will code 
the last problem. In order that the people working 
on the second and third problems do not have to 
wait for the first person to complete his task, I'll 
give each of you one set to begin coding the 
assigned problems while you await the problem 
set of the person feeding you. 


The reward manipulation was given next. All groups 
were told: 


Your earnings will be a function of both your 
group and individual performance. We are paying 
$.06 for each set coded correctly by the group. As 
an example, let's assume that this group codes 100 
sets correctly; that would mean that the group 
earnings will total $6.00. These earnings will be 
split in the following way. 


One of the following reward manipulations was then 
given. 
Differential reward: 

The person who codes the most number of cor- 
rect sets (problems) will receive one half of the 
group’s earnings or $3.00. The person who codes 
the next most correct sets (problems) will receive 
one third of the group’s earnings or $2.00. Finally, 
the person who codes the least correct sets 
(problems) will receive one sixth of the group's 
earnings or $1.00. 
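
As a rough sketch of the differential-reward split just quoted, the following computes each member's share from the $.06 rate and the one-half, one-third, one-sixth rule. The member labels and set counts are hypothetical, and the handling of ties is not stated in the instructions, so ties are simply not handled here.

```python
from fractions import Fraction

RATE_PER_SET = Fraction(6, 100)  # $.06 per correctly coded set, from the instructions
SHARES = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))  # first, second, third place


def differential_payouts(sets_by_member):
    """Return dollars per member, ranked by correct sets coded (ties not handled)."""
    group_earnings = RATE_PER_SET * sum(sets_by_member.values())
    ranked = sorted(sets_by_member, key=sets_by_member.get, reverse=True)
    return {name: float(group_earnings * share) for name, share in zip(ranked, SHARES)}


# Hypothetical group that codes 100 sets in total, as in the quoted example.
payouts = differential_payouts({"X": 40, "Y": 35, "Z": 25})
```

With 100 sets the group earns $6.00, which the rule divides into $3.00, $2.00, and $1.00, matching the figures in the instructions.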
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TABLE 1 


ANALYSIS OF VARIANCE RESULTS FOR TASK-FLOW- 
INTERDEPENDENCE AND REWARD CONDITIONS 


Source                                df        MS        F       p 

Between groups (G)                    23 
  Task-flow interdependence (T)        1   2211.12     4.88    <.05 
  Reward (R)                           1   1994.69     4.41    <.05 
  T × R                                1     55.50      <1 
  G/TR                                20    452.68     2.61    <.01 
Within groups                         48 
  Position (P)                         2    313.03     1.80    <.20 
  T × P                                2    108.78      <1 
  R × P                                2     49.94      <1 
  T × R × P                            2      $534      <1 
  S/GP/TR                             40    173.48 
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220) with reward and task-flow conditions as 
between-group variables and seating position 
as a within-group variable. The analysis of 
variance results are reported in Table 1. 
Groups/conditions was significant (p < .01); 
thus, this was used as the error term to 
calculate F ratios of between-group variables. 
Both hypotheses were supported. The 
subjects coded more problems (p < .05) in 
the low-task-flow-interdependence than in 
the high-task-flow-interdependence conditions. 
They also coded more problems (p < .05) in 
the differential-reward condition than in the 
equal-reward condition. The differential- 
reward condition also produced less satisfac- 
tion with performance (p < .05) than the 
equal-reward condition. There was no signifi- 
cant main or interaction effect found for any 
of the remaining postexperimental factors. 
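
The choice of error term can be checked with a short sketch that recomputes the F ratios from the mean squares reported in Table 1. The variable and function names are illustrative only.

```python
# Mean squares as reported in Table 1.
MS_TASK_FLOW = 2211.12
MS_REWARD = 1994.69
MS_GROUPS_WITHIN_CONDITIONS = 452.68  # G/TR, the between-group error term
MS_SUBJECTS_WITHIN = 173.48           # S/GP/TR, the within-group error term


def f_ratio(ms_effect, ms_error):
    """F ratio: effect mean square over the chosen error mean square."""
    return ms_effect / ms_error


f_task_flow = round(f_ratio(MS_TASK_FLOW, MS_GROUPS_WITHIN_CONDITIONS), 2)
f_reward = round(f_ratio(MS_REWARD, MS_GROUPS_WITHIN_CONDITIONS), 2)
f_groups = round(f_ratio(MS_GROUPS_WITHIN_CONDITIONS, MS_SUBJECTS_WITHIN), 2)
```

Testing the between-group effects against G/TR rather than S/GP/TR gives the smaller, more conservative F ratios that appear in the table.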
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nificantly related to total problems coded 
(r = .33). However, the effects of the Minne- 
sota Clerical Test and total problems coded 
within each treatment condition offer some 
support for the existence of differential valid- 
ity. The subjects in the combined equal- 
reward conditions are more predictable than 
those in the combined differential-reward con- 
ditions (p < .10). The differential validity of 
the Minnesota Clerical Test was even more 
pronounced for subjects in the equal-reward, 
low-task-flow-interdependence condition com- 
pared to those in the equal-reward, high- 
task-flow-interdependence condition (p < .05). 
These differences did not exist under dif- 
ferential reward. 
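
One common way to compare two independent correlations is Fisher's r-to-z transformation. The sketch below applies it to the equal-reward correlations in Table 2 (.67 and -.23, n = 18 each). The source does not state which test the authors used, so this is an assumption offered for illustration only.

```python
import math
from statistics import NormalDist


def fisher_z_difference(r1, n1, r2, n2):
    """z statistic and two-tailed p for the difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / standard_error
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


# Equal-reward correlations from Table 2: low vs. high task-flow interdependence.
z, p = fisher_z_difference(0.67, 18, -0.23, 18)
```

Under this test the difference is significant beyond the .05 level, which is consistent with the result reported in the text.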


DISCUSSION 


The experimental findings support competi- 
tion theorists who claim that differential 
rewards yield greater productivity. Although 
only two of the reward conditions used by 
Miller and Hamblin (1963) were included in 
this study, the results show that differential 
rewarding increases rather than decreases pro- 
ductivity. One major difference between the 
two studies is the incorporation of a group 
goal in both interdependence conditions. For 
every problem coded, each group member is 
benefited. Thus, it was in the best interest of 
every person to perform well. This mixed- 
motive situation is similar to a combined 
piece-rate and group-incentive payment in 
industry. This suggests that a group incentive 
may curtail blocking behavior that might 
otherwise occur with an individual-incentive 
plan. 

Another reason for the difference in results 
may be the opportunity for blocking 
behavior. Miller and Hamblin (1963) used 
a problem-solving task that encouraged the 
exchange of information under high inter- 
dependence. Since subjects were not in face- 
to-face interaction, blocking could occur with- 
out a threat of overt hostility. The present 
study used a production task which required 
no exchange of information. Since the subjects 
were in face-to-face contact, blocking behav- 
ior would probably have evoked an overt 
interference with other group members, hence 
provoking antagonism of others. 


TABLE 2 


CORRELATIONS OF MINNESOTA CLERICAL TEST WITH 
PRODUCTIVITY MEASURES UNDER REWARD AND 
TASK-FLOW-INTERDEPENDENCE CONDITIONS 


                      Task-flow interdependence 

Reward           Low     n     High     n     Combined     n 

Differential     .24    18      .16    18        os       36 
Equal            .67*   18     −.23    18        t0*      36 
Combined         .47*   36      .09    36       .33*      72 


* p < .01, two-tailed test. 


Differences in productivity between high- 
and low-task-flow-interdependence groups were 
not a result of differential blocking behavior, 
since overt blocking was negligible across all 
conditions. A more likely reason for this dif- 
ference is that the potential productivity of 
some subjects may have been constrained in 
the high-task-flow-interdependence conditions. 
This cannot account for the entire difference, 
however, since persons in Position 1 also dif- 
fered consistently, despite the fact that their 
productivity potential was not constrained in 
either condition. 

Lower satisfaction with performance in the 
competitive condition is consistent with the 
findings of Deutsch (1949). It is also worth 
noting that satisfaction and productivity were 
unrelated (r = .07). 

A limitation of the traditional personnel- 
differential approach to selection is its lack of 
accountability for mean differences in produc- 
tivity across conditions. Typically, the re- 
searcher will develop a selection strategy for 
a job under a given set of conditions. The 
fact that a different set of conditions may 
yield a higher productivity is usually ignored. 
The experimentally oriented researcher is well 
aware of these mean differences, but typically 
ignores the gains in productivity which could 
be made through an assessment of differential 
ability. In the present study, the combined 
differential-experimental analysis accounted 
for twice as much variance as the experi- 
mental analysis and three times that of the 
differential analysis.* 


+The concept of a combined differential-experi- 
mental research strategy was proposed by Cronbach 
(1957). Recent empirical work conducted by 
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interesting implications with respect to the 
Equal Employment Opportunity Commission 
(EEOC) guidelines on employee hiring and 
promotion procedures, which require the tester 
to determine that a significant difference be- 
tween conditions is absent when “protected 
groups” are involved (EEOC, 1970). Con- 
sider, for example, the coding task used in 
the present experiment where the function of 
the subject is to simply transfer responses 
from a questionnaire to a coding form. Based 
on face validity and previous validation re- 
search, it would appear that the Minnesota 
Clerical Test ought to predict coding behavior 
for a variety of work conditions. Yet the 
results of this research indicate that the valid- 
ity of the relationship may vary significantly 
between reward and task-flow-interdependence 
conditions and that a tester may be violating 
EEOC guidelines by not testing the moder- 
ating effects of the situation. The tester may 
have no way of knowing, a priori, if a dif- 
ferential relationship exists between working 
conditions. 

The economic implications of a combined 
experimental-differential research strategy be- 
come apparent through an analysis of differ- 
ential utility. Utility of selection may vary 
greatly under differing work conditions, thus 
affecting the decision of whether or not testing 
would be profitable. In order to analyze differ- 
ential utility, the authors provide an illustra- 
tion using the data collected in the present 
experiment. Utility curves for each experimen- 
tal condition were derived from the Naylor- 
Shine tables (Naylor & Shine, 1968). Mean 
standard utility scores for total problems 
coded were calculated for selection ratios 
diene 159. In order to make the curves 
directly comparable, all productivity scores 
were Standardized On the productivity dis- 
orn n direi rer, htt 
condition dence condition. Thus, x15 
eap constitutes a baseline from which a 

Dcremental gains in other conditio 
due to the use of the Minnesota Clerical Test- 
The cost of testing is not incorporated in 
this analysis, since it is invariant across 
conditions for each selection ratio in which the 
number of applicants is constant. In order to 
assess the effect of the cost of testing, one 
should lower each curve by the same constant 
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It is clear from Figure 1 that the differential- 
reward, low-task-flow-interdependence groups 
with testing yield the greatest gains in pro- 
ductivity up to a selection ratio of .22. For 
selection ratios above .22, the productivity 
gained in the equal-reward, low-task-flow- 
interdependence condition with testing ex- 
ceeded that of any other condition. Thus, if 
one were to select which condition to adopt, 
the optimal decision should be based on the 
intersect of these two curves. Subsequently, if 
the cost of testing were large enough to lower 
part of the differential-reward, low-task-flow- 
interdependence curve below the baseline 
(no testing), a second intersect would be 
established, below which testing would yield 
disutility. 

The bottom two curves suggest the extent 
of disutility if the overall Minnesota Clerical 
Test-productivity relationship had been ap- 
plied to these conditions. 
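
The selection component of such utility curves can be sketched as follows. The Naylor-Shine tables give the expected mean standard criterion score of those selected as the validity coefficient times the ratio of the normal ordinate at the cutoff to the selection ratio. The sketch assumes bivariate normality and top-down selection. It does not reproduce Figure 1, because it omits the mean differences between conditions and the baseline standardization described in the text. The selection ratios shown are illustrative.

```python
from statistics import NormalDist


def mean_criterion_gain(validity, selection_ratio):
    """Expected mean standard criterion score of those selected top-down on the test."""
    std_normal = NormalDist()
    cutoff = std_normal.inv_cdf(1 - selection_ratio)  # test z score at the cut
    ordinate = std_normal.pdf(cutoff)                 # normal density at the cut
    return validity * ordinate / selection_ratio


# Validities from Table 2 for the two low-task-flow-interdependence cells.
for ratio in (0.1, 0.3, 0.5, 0.7, 0.9):
    equal = mean_criterion_gain(0.67, ratio)
    differential = mean_criterion_gain(0.24, ratio)
    print(f"SR={ratio:.1f}  equal={equal:.2f}  differential={differential:.2f}")
```

The gain from testing shrinks as the selection ratio rises, which is why a fixed cost of testing matters most where a curve lies close to the no-testing baseline.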

These analyses should be accepted with 
caution since the sample size is small and, 
although the subjects were placed in a work 
situation, the experimental conditions remain, 
to a large degree, short-run simulations rather 
than actual job situations. Nevertheless, they 
should convince researchers and practitioners 
to consider situational factors when calcu- 
lating utility. To ignore them may be costly, 
especially when the selection ratio is low and 
the cost of testing is high. Where utility is 
negative for any situation, then another valid 
test should be found; if not, a random selec- 
tion may be the best available alternative. 


CONCLUSIONS 


The findings support the hypotheses that 
competitive reward conditions yield greater 
productivity than equal-reward conditions 
and that low-task-flow interdependence yields 
higher productivity than high-task-flow inter- 
dependence. A combined differential-experi- 
mental design accounted for considerably 
more variance than either the differential or 
experimental design alone. The Minnesota 
Clerical Test numerical scale was differ- 
entially related to productivity depending 
on the reward and task-flow-interdependence 
conditions. 
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ity, turnover, absenteeism, and injuries. A 
second study was designed to test a hypothe- 
sis developed on the basis of the obtained rela- 
tionships. 


STUDY 1 
Job Studied 


The data were collected in conjunction 
with a larger study designed to identify spe- 
cific behaviors, characteristics, and attitudes 
that differentiate the effective from the in- 
effective full-time independent pulpwood pro- 
ducer in the southern United States (Ronan 
& Latham, 1969, 1970). Pulpwood producers 
are small, independent businessmen who har- 
vest timber and sell the product to the paper 
industry. They employ a crew generally rang- 
ing in size from two to eight men. Their 
operations vary in level of mechanization 
from chain saws and a truck to highly so- 
phisticated equipment that cuts, delimbs, and 
saws a tree into appropriate lengths in one 
continuous process. 


Method 


A questionnaire was developed to relate the pro- 
ducer’s supervisory practices, his attitudes toward 
his employees, and various demographic variables to 
four criteria: production, turnover, absenteeism, and 
injuries. Productivity was defined as cords of wood 
harvested during a 40-hour sample week and esti- 
mated annual production of all forest products. Two 
types of turnover were measured: number of men 
who quit and number of men who were fired in the 
last year. Absenteeism was defined by the number 
of days one or more men were missing from work 
in the one-year study period. Injury was defined as 
the number of men who had been hurt to the extent 
that they had missed one or more days of work 
during that year. All criteria were indexed by number 
of employees in each crew in order to facilitate 
comparisons between producer operations. 

Data relevant to the present study include two 
items on the questionnaire, namely, “Do you tell your 
men that some set production is needed for the day 
or week?” and “The number of hours the producer 
is on the job with his employees.” The latter variable 
served as a measure of the presence of supervision 
for this study. Measures of other supervisory prac- 
tices included questions relating to the following: 
training, giving instructions and explanations, the 
presence of a key man (e.g., straw boss), whether 
the producer had military experience and had thus 
received first-hand experience in the importance of 
supervision, and whether he used the same method 
of payment (e.g., piece rate versus hourly) for all 
his employees or varied the method depending upon 
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TABLE 1 
ROTATED FACTOR LOADINGS 
Factor 
Variable 
18 2» 3 i 

Set goals? A3 | 31 | —.58 
Gives instructions a 
Key man ` —.23 | 38 .19 | 
Employees fired! ERI 
Employees quit! 08 
Injury rate =g | 
Equipment plans 20 
Provide training 19 
No. men hired .90 
Military experience A8 
Production quality 

(rating) .23 .39 
Method of payment 19 .69 | .59 
Annual production? 59 37 
Production hours worked 

by producer? 46 42] .70 
Weekly production? 0 53 
Mechanized operation AS 22) 45 


i te 
the above variables. Only loadings of .17 or greater 
are shown. 

a An employee-production-centered style of supervision that 
included goal setting. 

b A production-centered style of superv 
c An employee-centered style of supervision that did not 
include goal setting. 

d Variable of primary interest to this study. 


the different job tasks (e.g., cutting timber vs. driv- 
ing a truck). 

The questionnaire was administered to 292 pro- 
ducers who were randomly selected from a stratified 
geographical population. In brief, the wood-produc- 
ing population was stratified on the basis of where 
they delivered their product, the size of their an- 
nual production, the length of time on a given 
tract, and their primary business interest. The 
method and rationale for the sample selection have 
been described in detail in a previous publication 
(American Pulpwood Association, 1968). 


Results 


The data were subjected to a factor analy- 
sis using the method of principal components 
with a varimax rotation and multiple Rs as 
diagonal values. The three factors of impor- 
tance to this study, as extracted from the 
intercorrelation matrix, are shown in Table 1.* 

Factor 1 was interpreted as a production- 
employee supervisory dimension. It identified 


8 The intercorrelation matrix is not shown due to 
space limitations. A copy will be made available on 
request to the second author. 
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producers who set production goals and re- 
main on the job site to oversee their employ- 
ees. Measurement of the presence of super- 
vision—the number of hours the producer 
stayed with his men—was supported by such 
measures as a positive response to questions 
related to giving the men instructions and 
explanations, providing training, using varied 
methods of employee payment, and having 
military experience and as a negative response 
to the question, “Do you have a key man in 
your operation?” (the producers saw them- 
selves in this role). These behaviors corre- 
lated positively with productivity and nega- 
tively with injury rate. 

Factor 2 was interpreted as a supervisory 
style that is production centered only. It 
identified moderately mechanized producers 
who set production goals only. Working with 
employees did not load on this factor. An in- 
dication of poor supervision was provided by 
a positive response regarding the belief of a 
key man in the crew, thus indicating that 
these producers were not the primary source 
of leadership. These behaviors correlated 
positively with compulsory and voluntary em- 
ployee termination. 


Factor 3 was simply interpreted as ineffec- 
; aS defined b 


productivity 
goals when 
Other su- 


ng production 


STUDY 2 
The results of the 
necessarily indicate c; 


study was intended 
Sense the re 


that the investigation Was not pl 
explicit hypotheses concerning 
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supervisory styles regarding goal setting, but 
rather to furnish data for such hypotheses. 
The zero-order correlations between the 
independent and dependent variables were 
generally significant at the .05 level. As in- 
ferred from the communalities, the reliability 
of the measures was high except for equip- 
ment plans and military experience, variables 
that did not affect data interpretation. Never- 
theless, the raw r values were quite low. As a 
result, a second study was planned. Producers 
who had not participated in the first study 
were classified on the basis of the three 
patterns: (a) producers who supervise their 
men and set production goals, (b) producers 
who set production goals only, and (c) pro- 
ducers who supervise their men but do not set 
production goals. 
The purpose of the second study was to 
test the hypothesis that supervision, as de- 
fined by staying on the job with the men 
and setting a daily or weekly production 
goal, results in higher productivity than 
supervision that does not include goal setting 
or goal setting that is not accompanied by 
supervision. 


Method 

Each year a large paper company conducts a segs 
week survey of all independent pulpwood producers 
who supply it with wood. The purpose of this study 
is to identify physical variables such as p ues 
wood, equipment, weather, etc., that affect produc- 
tivity. In 1970 the survey was modified to pr 
questions concerning specific job behaviors that the 
eia identified in previous studies as critical to the 
independent producer's job success (Latham, 1969a, 
1969b, 1970; Ronan & Latham, 1969, 1970). Ques- 
tions relative to the present study were as follows: 
(a) Does the producer stay in the woods with his 
men and set production goals? (b) Does the pro- 
ducer stay in the woods with his men but not set 
production goals? (c) Does the producer set produc- 
tion goals but not stay on the job with his men? 
Productivity was defined as cords per man day, that 
is, total weekly production divided by total hours 
worked by the crew multiplied by eight.* The eight 
is a constant that is used by the company to stand- 
ardize the man-day production for each sree 
operation. The data were collected at the job site. 

The survey was administered to the entire popula- 
tion of full-time independent producers who supply 
the company with pulpwood (N = 1000 individuals). 
Data were analyzed for $97 producers since 7 


4 Measures of turnover, absenteeism, and injury 
were not suitable for this s 
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TABLE 2 


MEANS, STANDARD DEVIATIONS, AND F VALUES (SCHEFFÉ TEST) 
FOR PRODUCTIVITY UNDER VARIED FORMS OF SUPERVISION 


Form of supervision      n      Mean cords per day 

Supervision (A) 
Goal setting (B)         28 
A × B                   407 

* p < .01. 
questionnaires were discarded because they were 


completed incorrectly. 
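
As a minimal sketch of the productivity index described in the Method, the following computes cords per man-day as weekly production divided by total crew hours, multiplied by the company's constant of eight. The function name and the example crew are hypothetical.

```python
HOURS_PER_MAN_DAY = 8  # the standardizing constant named in the text


def cords_per_man_day(weekly_cords, total_crew_hours):
    """Weekly production divided by total crew hours worked, scaled to an 8-hour day."""
    if total_crew_hours <= 0:
        raise ValueError("total_crew_hours must be positive")
    return weekly_cords / total_crew_hours * HOURS_PER_MAN_DAY


# Hypothetical crew: 4 men working 40 hours each, harvesting 60 cords in the week.
example = cords_per_man_day(weekly_cords=60, total_crew_hours=4 * 40)
```

Indexing by hours worked rather than by crew size keeps operations with different crew sizes and work weeks on the same scale.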


Results 


Table 2 shows the means and standard de- 
viations for the productivity of each producer 
classification: producers who provide on-the- 
job supervision but do not set production 
goals, producers who set production goals but 
do not provide on-the-job supervision, and 
producers who set production goals and pro- 
vide on-the-job supervision. Producers who 
stay on the job with their men and set pro- 
duction goals had the highest cords per man- 
day production, followed, respectively, by 
producers who set production goals but do 
not supervise their employees and producers 
who supervise their men but do not set 
production goals. 

An analysis of variance of the mean produc- 
tivity for each group of producers was highly 
significant (F = 15.42, p < .01), indicating 
that method of supervision affects man-day 
production. The Scheffé (1953) test was ap- 
plied to the data to determine which method 
of supervision was superior. These results are 
also shown in Table 2. 

Examination of the remaining questions on 
the company’s survey indicated that the three 
groups of producers did not differ appreciably 
in the type of equipment they owned, the type 
of wood they harvested, the terrain on which 
they operated, or in the crew's years of 
experience. 

The hypothesis that supervision as defined 
by staying on the job and setting production 
goals results in higher cords per man day 
than supervision that does not include goal 
setting was accepted. No significant differ- 
ences were found between producers who set 
production goals but do not supervise the crew 
and either of the other two classifications. 

1.34 | 
2.96 | 1.22 | s 
3.14 1.70 30.77* (A vs. A X B) 


DISCUSSION 


The findings of this study seem to provide 
support for a cognitive theory of motivation 
in an industrial setting, and they provide 
support for Ryan (1958, 1970) and Ryan and 
Smith’s (1954) contention that the “task” or 
“goal” be taken as the fundamental unit in 
motivation. This theory has been cogently 
summarized by Hilgard and Atkinson (1967): 
When an individual knows what he wants, 
knows the effort that will be involved in over- 
coming obstacles along the way, and knows 
what satisfactions the end state will bring, he 
can put his goals into action. Stated another 
way, “to the extent that an individual makes 
clear plans, is guided by his expectations and 
the risks involved, and moves steadfastly 
toward his goals, he is motivated by his cogni- 
tions [Ivancevich, Donnelly, & Lyon, 1970, 
p. 140].” 

Of further theoretical significance is that 
the two studies support the hypothesis origi- 
nally implied by Locke (1966) that super- 
vision can lead to maximum performance if 
it includes the establishment of specific per- 
formance goals. Supervision without goal 
setting did not correlate with any performance 
criterion in the first study, and it resulted in 
inferior performance in the second study. 
However, the first study indicated that the 
converse of the above hypothesis may also be 
true. Goal setting has a positive effect on per- 
formance in an industrial situation only when 
it is accompanied by supervision. Goal setting 
without supervision resulted in high labor 
turnover. Thus, it would appear that assigned 
goals do not affect performance unless a 
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supervisor is present to encourage their 
acceptance. 

These findings would seem to contradict the 
generally accepted conclusion that “closeness” 
of supervision has a detrimental effect on em- 
ployee performance. The apparent contradic- 
tion supports Patchen’s (1962) work, which 
showed that closeness of supervision was not 
the relevant parameter but rather that it 
was what behaviors and role the supervisor 
exhibited. 

Although the mean differences in produc- 
tivity in the second study between goal 
setting accompanied by supervision versus goal 
setting alone lend credence to the conclusion 
that goal setting affects performance in indus- 
try only when it is accompanied by super- 
vision, the differences were not significant. 
The lack of significance may be the result of 
the relatively small sample size of producers 
who set production goals only (n = 28), or it 
may reflect a misunderstanding by observers 
in interpreting the question 


- The low stan- 
dard deviations in Table 2 


tend to negate the 
of the second pos- 
ided in studies by 
ham, 1969a, 1969]. 
1969, 1970). Many 
ision for their em- 


Periods of generally no 
Paul d a hour per truck load. Although 
1970) 1 ents (Latham, 19692, 


or more each day with li i 
Finally, the present ; 
mentary Support for th LED. 


e Management by ob. 
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jectives philosophy advocated by Drucker 
(1954) and Odiorne (1965) in that they stress 
the need for and the interaction of supervision 
and goal setting. Effective supervision is more 
than simply remaining on the job to oversee 
subordinates. It involves specifying group 
goals, defining major areas of responsibility, 
and using these guidelines as a means for 
controlling and assessing job performance. 
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Implications of the res 


hypothesis, Type of pay system accounted 


for 40% of the variance in instru- 
variables (pay level, age, sex, marital 
ults for designing 
theory were discussed. 


expectancy 
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the role of organizational variables coul and 
used to modify organization structures ;ely 
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motivating work climate, istic 
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the type of System used to compensate de- 
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IMPACT OF ALTERNATIVE 


(merit vs. time) on the rated effort levels and 
Pay valence of nurses. They found that nurses 
in a hospital paying on merit were rated as 
expending more effort than nurses in a hos- 
pital paying only by time. No differences 
were found in the average valence attributed 
to pay across the two organizations. However, 
this study did not control for any interor- 
ganizational or individual differences such as 
age, experience, and pay levels of the partici- 
pating nurses. 

The present study was designed to investi- 
gate the impact of three different pay systems 
on employee perceptions of the (a) instru- 
mentality of performance for attaining pay 
and (b) valence of pay. The first hypothesis 
stated that instrumentality perceptions are 
highest among employees paid an individual 
incentive, next highest among employees paid 
a group incentive and lowest among hourly 
paid employees. This hypothesis is based on 
the expectation that the closer performance 
is linked to pay in actuality by the reward 
system, the higher will be pay-instrumental- 
ity perceptions. The second hypothesis stated 
that valence of pay is highest among em- 
ployees paid an individual incentive, next 
highest among employees paid a group incen- 
tive, and lowest among hourly paid employ- 
ees. This hypothesis follows Lawler (1971) 
who suggested that objects have higher val- 
ence when they are attached to performance 
that requires effort. 


METHOD 
Sample 


The organization studied is a large producer of 
consumer goods located in the midwest. The firm 
employed approximately 2,000 male and female pro- 
duction workers in its main facility, where this study 
was conducted. About half of the work force was paid 
on an hourly basis, while the other half was paid on 
a piece-rate or group incentive system. Individual 
employees are allocated to jobs paid by the alterna- 
tive systems primarily on the basis of organizational 
need rather than on the interests or desires of the 
employees. Thus, there is no reason to suspect any 
systematic self-selection into various pay systems. 

A random sample of 350 employees stratified by 
pay system was drawn from the total population of 
workers. Of the total, 77 were absent on the day 
of the survey, did not mail the questionnaire to the 
researcher as requested, or failed to adequately com- 
plete the questionnaire. Thus, the final usable sample 
was 273 or 78% of the initial sample. There were 
61 hourly paid, 84 group incentive, and 128 piece- 
rate employees in the final sample. The participants 
were largely semiskilled machine operators and 
tenders. Their mean age was 35.6 years, their mean 
tenure with the firm was 6.4 years, and their average 
hourly wage was $2.98. Sixty-five percent of the final 
sample were females and 66% were married. 


Procedure 


Information about valence and instrumentality 
perceptions and personal characteristics of the sub- 
jects was obtained from a questionnaire used to 
gather data for a larger study. Questionnaires were 
administered at the plant on company time by a 
graduate student from the University of Wisconsin. 
The project was identified as a research study and 
its voluntary nature was stressed. Subjects were 
asked to send their questionnaires directly to the 
university to insure that officials of the firm would 
not have access to their responses. 

A modified paired-comparison procedure was used 
to measure the dependent variables, instrumentality 
and valence. The instrumentality question 
read: How much weekly pay I get depends more on? 
The alternative indicating high instrumentality was 
how much I produce, and the five alternative deter- 
minants of pay were my seniority, my job level, the 
union, overtime and hours of work, and my co- 
workers. A respondent who chose how much I pro- 
duce in three of these comparisons was given an 
instrumentality score of three, a respondent choosing 
that alternative in four comparisons was given an 
instrumentality score of four, and so on, The valence 
of the second-level outcome (pay) was measured 
using the following question: Which is more impor- 
tant to me in my ideal job? The alternative indi- 
cating high valence of compensation was making 
good money. It was compared with each of five other 
possible second-level outcomes: steady job, friendly 
co-workers, good supervision, doing work I like, and 
opportunities for promotion, Responses were scored 
in a fashion analogous to the scoring of the instru- 
mentality question. (Additional information on these 
measures including test-retest stability evidence will 
be found in Schwab & Dyer, 1973.) 
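
The paired-comparison scoring described above can be sketched as a simple count. This is a minimal illustration, not the authors' scoring program: the function name, the string labels, and the example respondent are all assumptions made for the sketch.

```python
from typing import Sequence

# Label of the alternative that signals high instrumentality.
# (Wording taken from the questionnaire item; using it as a string key is an assumption.)
TARGET = "how much I produce"


def paired_comparison_score(choices: Sequence[str], target: str = TARGET) -> int:
    """Count the paired comparisons in which the target alternative was chosen."""
    return sum(1 for choice in choices if choice == target)


# One hypothetical respondent: one choice per comparison (illustrative data only).
example = [
    "how much I produce",  # versus my seniority
    "my job level",        # versus my job level
    "how much I produce",  # versus the union
    "how much I produce",  # versus overtime and hours of work
    "my co-workers",       # versus my co-workers
]

print(paired_comparison_score(example))  # prints 3
```

The same function would score the valence item if `target` were set to "making good money" and the five choices came from the valence comparisons.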

Information was also obtained on five personal 
characteristics. These characteristics were chosen to 
be controlled on the basis of previously demonstrated 
relationships with instrumentality or valence or be- 
cause it was hypothesized that they might be related 
to those perceptions. The control variables chosen 
and their hypothesized relationships with valence and 
instrumentality were pay level (negatively related 
with valence, Lawler, 1971), age (negatively related 
with instrumentality and valence, Heneman, in
press), sex (higher valence among males, Lawler, 
1971), marital status (higher valence among the mar- 
ried), and tenure with the organization (negatively 
related to valence). Data on these variables were 
obtained on the questionnaire. In addition, the inde- 
pendent variable—type of pay system—was obtained 
from personnel records. 




TABLE 1


MULTIPLE AND PARTIAL CORRELATION ANALYSIS OF
THE IMPACT OF PAY SYSTEMS ON INSTRUMENTALITY
AND VALENCE PERCEPTIONS


                        Instrumentality              Valence
Variable             Controls    Controls and    Controls    Controls and
                     only        pay system      only        pay system

Piece rate              --         .64**            --       [illegible]
Group incentive         --       [illegible]        --         .14*
Wage level         [illegible]*    .04             .18**       .14*
Age                    -.07       -.02            -.08        -.06
Sexᵃ                   -.19**     -.06            -.08        -.05
Marital statusᵇ        -.00       -.01            -.05        -.06
Tenure             [illegible]    -.09             .16*        .18
R²                      .04*       .44**           .07**       .10**


ᵃ Dummy variable, female = 0.
ᵇ Dummy variable, married = 0.
* p < .05.
** p < .01.


Analysis 


control variables are treated 


workers, 
nificantly 


(t = 14.45 and 10.74


The analysis reported in the first two columns of Table 1
was designed to reduce chances of erroneously attributing
differences in instrumentality perceptions to pay systems
if differences in personal characteristics were responsible.
The first column shows the partial correlations and the
variance-explained estimate when instrumentality scores were
regressed on the five control variables alone. Only wage level
and sex are significantly related to instrumentality
perceptions. High-instrumentality perceptions are associated
with high wages and females. Both observations are probably
spurious, however, since sex and wage level are correlated
with type of wage system. In all, the control variables
account for only 4% of the instrumentality variance (R = .20).

The probable spurious nature of the instrumentality
correlations with sex and wage level is supported by an
examination of the second column in Table 1, where pay system
has been introduced into the regression equation. In this
analysis, only the wage system partial correlations are
significant. These two correlations are to be interpreted as
showing the magnitude, in correlation terms, of the difference
in instrumentality scores between the hourly pay group and
each incentive group holding the other incentive group and all
control variables constant.² The amount of instrumentality
variance explained when all variables are included is 44%
(R = .66). A test of the incremental R² (Cohen, 1968) showed
that type of wage system accounts for a significant amount of
instrumentality perception variance after the five variables
have been accounted for (F = 88.64, p < .01).
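
The test of the incremental R² can be illustrated with the standard hierarchical-regression F ratio. The sketch below is an assumption about the general form of the test (added variance per added predictor, divided by residual variance per residual degree of freedom), not a reproduction of the author's computation. Because Table 1 reports R² to two decimals only, plugging in the rounded values will not reproduce the reported F = 88.64 exactly.

```python
from typing import Tuple


def incremental_f(r2_full: float, r2_reduced: float, n: int,
                  k_full: int, k_added: int) -> Tuple[float, int, int]:
    """F test for the increase in R² when k_added predictors join a model.

    r2_full    -- R² with all k_full predictors entered
    r2_reduced -- R² with only the first (k_full - k_added) predictors
    n          -- number of cases
    Returns (F, numerator df, denominator df).
    """
    if not 0.0 <= r2_reduced <= r2_full < 1.0:
        raise ValueError("need 0 <= r2_reduced <= r2_full < 1")
    df_num = k_added
    df_den = n - k_full - 1
    if df_num <= 0 or df_den <= 0:
        raise ValueError("degrees of freedom must be positive")
    f_ratio = ((r2_full - r2_reduced) / df_num) / ((1.0 - r2_full) / df_den)
    return f_ratio, df_num, df_den


# Rounded values from the text: five controls plus two pay-system dummies, N = 273.
f_ratio, df_num, df_den = incremental_f(0.44, 0.04, n=273, k_full=7, k_added=2)
print(round(f_ratio, 2), df_num, df_den)
```

The degrees of freedom used here (2 and 265) follow from counting five control variables and two dummy variables; that count is inferred from the description of the analysis rather than stated in the source.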


The second part of the analysis examined the relationship
between pay system and pay valence. Valence scores of
individual incentive (X̄ = 2.88 on a 5.0 scale) and group
incentive (X̄ = 2.85) workers were significantly higher
(t = [...] and 2.46, respectively) than in the hourly group
(X̄ = 2.24) as hypothesized, but there were no significant
mean differences between the two incentive groups (t < 1.0).


² Wage system was coded in a dummy variable format which
involves successively dichotomizing values into 1-0 groups.
In the case at hand, piece-rate and group incentive values
were dichotomized, leaving the hourly paid group to be treated
in the alpha constant. Cohen (1968) provides an excellent
discussion of the use of dummy variables in psychological
research.
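
The dummy-variable coding of wage system described in the footnote can be sketched as follows. The category labels and function name are hypothetical; the only point carried over from the text is that piece-rate and group incentive each get their own 1-0 indicator while the hourly group is left as the reference absorbed by the regression constant.

```python
from typing import Tuple

# Hypothetical labels for the three pay systems in the study.
PAY_SYSTEMS = ("hourly", "piece_rate", "group_incentive")


def dummy_code(pay_system: str) -> Tuple[int, int]:
    """Return (piece_rate, group_incentive) indicators.

    The hourly group is the reference category and is coded (0, 0),
    so its level is carried by the regression constant.
    """
    if pay_system not in PAY_SYSTEMS:
        raise ValueError(f"unknown pay system: {pay_system!r}")
    return (int(pay_system == "piece_rate"),
            int(pay_system == "group_incentive"))


for system in PAY_SYSTEMS:
    print(system, dummy_code(system))
```

With this coding, each dummy's partial coefficient compares one incentive group with the hourly group while the other incentive indicator is held constant, which matches the interpretation given for the pay-system partial correlations.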

Column 3 of Table 1 shows that two control variables, wage
level and tenure, are positively and significantly related to
pay valence when the other control variables are partialed
out. The fourth column also shows there is a difference in
valence scores between hourly and each incentive group even
after the control variables are accounted for, as evidenced
by the significant piece-rate and group incentive partials.
Moreover, the increment in the variance explained between the
control variables only and the control plus pay system
variables is significant (F = 4.41, p < .01). Thus, type of
pay system is shown to influence pay valence perceptions after
five personal characteristic variables are accounted for. The
magnitude of the increase, 3%, however, is small.


Discussion 


It was hypothesized that pay valence and instrumentality
perceptions would be highest among piece-rate employees,
followed by group incentive, and lowest among hourly paid
employees. The hypothesis was confirmed with respect to
instrumentality perceptions. Piece-rate workers generally saw
weekly pay as being more a function of their performance than
seniority, job level, the union, overtime and hours of work,
and co-workers. To a lesser extent, this was also true of
group incentive workers. Hourly paid workers alternatively
tended to choose explanations other than performance as the
major sources of their weekly pay.

It is interesting to contrast the instrumentality results
obtained on the piece-rate workers here with results from a
study by Georgopoulos, Mahoney, and Jones (1957). They
investigated the predictability of self-report performance
data using pay valence and instrumentality and a measure of
perceived freedom among 621 piece-rate operatives. A secondary
analysis of the instrumentality data shows that only about a
third of the workers felt that their performance resulted in
increased long-run earnings. The apparent differences in
instrumentality perceptions between the two studies may be due
to differences in the communication and ad- 
ministration of the incentive system in the 
two organizations. Alternatively, the differ- 
ences may be due to the difference in time 
perspective of the instrumentality questions. 
The long-run focus of Georgopoulos et al. 
(1957), in contrast to the one-week time 
referent in the present study, may have raised 
employee concerns about increases in produc- 
tion standards and/or job security. Additional 
research using both short- and long-run time 
perspectives could make a valuable contribu- 
tion toward our understanding in this area. 

The hypothesis relating pay-valence per- 
ceptions and pay system was only partially 
supported. Subjects in both incentive condi- 
tions had significantly higher valence scores 
than hourly paid employees. There were, how- 
ever, no differences between the two incentive 
groups. In addition, the valence variance ex- 
plained by type of pay system, although sig- 
nificant, was small. Regressing valence on pay 
system alone explained only 3% of the vari- 
ance in valence. Pay system also added 3% to 
variance explained when it was entered in the 
regression equation following the control vari- 
ables. Thus, while the results showed that 
type of pay system did influence pay valence, 
its impact was not substantial. 

The results obtained between valence and 
control variables did not conform to expecta- 
tions. In the present study, males did not 
value pay more than females, older employees 
did not value pay less than younger em- 
ployees and lower paid employees did not 
value pay more than higher paid employees 
as has been found previously (Lawler, 1971). 
Indeed, tenure and pay level were positively 
and significantly related to pay valence. 

This study should be of particular interest to those who
have argued that the motivational consequences of incentive
systems are often dysfunctional (e.g., Whyte, 1955). The
findings reported here suggest that incentive systems result
in a substantially higher perceptual link between performance
and pay than does payment by time. To a much lesser extent,
incentive pay systems appear to have a positive impact on pay
valence. High levels of both instrumentality and valence are
presumed to have positive motivational implications in
expectancy theory.

The results should not, however, be 
interpreted as suggesting that incentive sys- 
tems have only desirable motivational conse- 
quences. For example, it has been argued that 
high performance in an incentive system may 
lead to peer group disapproval. Moreover, the 
degree of task specialization which frequently 
accompanies wage systems may reduce oppor- 
tunities for the individual to obtain intrinsic 
rewards from performing. The incentive sys- 
tem thus may not have the desired motiva- 
tional impact for those individuals who find 
peer group disapproval unattractive and/or 
intrinsic rewards attractive. This might occur, 
depending on the magnitude of the various 
valences and instrumentalities, even though 
the incentive system influenced instrumental- 
ity of pay and pay valence as it appeared to 
in the present study. These possibilities suggest that
future research on the determinants
of motivational perceptions should examine
the impact of various personal and organiza- 
tional characteristics on instrumentality and 
valence perceptions for a variety of work- 
related outcomes, not just pay. 


DONALD P. SCHWAB


REFERENCES


Cohen, J. Multiple regression as a general data-analytic
system. Psychological Bulletin, 1968, 70, 426-443.

Georgopoulos, B. S., Mahoney, G. M., & Jones, N. W. A
path-goal approach to productivity. Journal of Applied
Psychology, 1957, 41, 345-353.

Heneman, H. G., III. The relationship between age and the
motivation to perform on the job. Industrial Gerontology,
in press.

Heneman, H. G., III, & Schwab, D. P. Evaluation of research
on expectancy theory predictions of employee performance.
Psychological Bulletin, 1972, 78, 1-9.

Lawler, E. E., III. Managers' attitudes toward how their pay
is and should be determined. Journal of Applied Psychology,
1966, 50, 273-279.

Lawler, E. E., III. Pay and organizational effectiveness: A
psychological view. New York: McGraw-Hill, 1971.

Opsahl, R. L., & Dunnette, M. D. The role of financial
compensation in industrial motivation. Psychological
Bulletin, 1966, 66, 94-118.

Schneider, B., & Olson, L. K. Effort as a correlate of
organizational reward system and individual values.
Personnel Psychology, 1970, 23, 313-326.

Schwab, D. P., & Dyer, L. D. The motivational impact of a
compensation system on employee performance. Organizational
Behavior and Human Performance, 1973, 9, 215-225.

Vroom, V. H. Work and motivation. New York: Wiley, 1964.

Whyte, W. F. Money and motivation. New York: Wiley, 1955.


(Received August 14, 1972) 



Journal of Applied Psychology
1973, Vol. 58, No. 3, 313-317


ACHIEVEMENT MOTIVATION, OCCUPATIONS, AND 
LABOR TURNOVER IN NEW ZEALAND 


GEORGE H. HINES? 


Victoria University of Wellington, New Zealand 


A nonprojective measure of achievement motivation was used to investigate
the relationship among need for achievement (n Ach), labor turnover, and
occupations in New Zealand. Questionnaire results from 315 entrepreneurs,
engineers, accountants, and middle managers revealed low turnover among
high n Ach self-employed subjects. High-turnover subjects displayed
significantly higher achievement motivation levels than low-turnover
subjects. Among engineers, accountants, and middle managers, those with
high n Ach had high labor mobility rates. Results were supportive of
McClelland's theory and demonstrated the feasibility of extension of the
model through use of nonprojective research methods.


The concept of need for achievement (n 
Ach) has been extensively developed by Mc- 
Clelland and his colleagues (McClelland, At- 
kinson, Clark, & Lowell, 1953). Using pro- 
jective techniques to measure achievement 
motivation, researchers have extended the 
study of n Ach to many diverse cultures 
(Heckhausen, 1967; Hines, 1972; Storm, 
Anthony, & Porsolt, 1965). The Thematic 
Apperception Test, which has been the prime 
means of measuring n Ach, has been criti- 
cized for practical limitations (Smith & Field, 
1958) that have inhibited more widespread 
exploration of the theory. Problems of inter- 
rater reliability, rigorous scorer-training re- 
quirements, and lengthy scoring procedures 
have led to the search for nonprojective tech- 
niques to supplement the Thematic Appercep- 
tion Test (Costello, 1967; Hermans, 1970; 
Holmes & Tyler, 1968; Lynn, 1969). The 
Lynn Achievement Motivation Questionnaire, an eight-item
inventory, has been used with promising results on four
continents (Iwawaki & Lynn, 1972; Melikian, Ginsberg,
Cüceloglu, & Lynn, 1971). The work of Holmes and Tyler
indicated that n Ach is a conscious phenomenon and therefore
subject to direct self-report. These findings provided
experimental evidence that questionnaire measures of n Ach
could elicit valid data, thus clearing the path for
large-scale research on achievement motivation.


* Requests for reprints should be sent to George H. Hines,
Victoria University of Wellington, P. O. Box 196,
Wellington, New Zealand.


According to achievement motivation the- 
ory, the entrepreneur is likely to be high in n 
Ach (Hornaday & Bunker, 1970; McClelland, 
1965). The manager who works for someone 
else, on the other hand, tends to be high in 
need for power and significantly lower on the 
n Ach scale (McClelland & Winter, 1969). 
The professional engineer and accountant dif- 
fer from entrepreneurs and managers in that 
the standards by which the former are judged 
are known to be set by their colleagues and 
professional societies (Hines, 1971). While it 
has not been established if high achievers are 
over- or underrepresented in the professions, 
it seems reasonable to suggest that high 
standards, clearly defined goals, and oppor- 
tunities for recognition would attract at least 
the normal n Ach distribution. 

Labor turnover is very high in New Zea- 
land (New Zealand Department of Statistics, 
1971), with an annual national average for 
males of over 60%. The turnover of entrepre- 
neurs, however, would be expected to be small, 
indicative of the relative labor stability of 
those who own their own business. Managers 
and professionals are more likely to have 
fluid employment patterns, reflecting such 
factors as job satisfaction, pay, personal am- 
bitions, and geographical preferences. Among 
these latter occupational groups, individuals 
high in n Ach would be expected to exhibit 
higher turnover than those low in achieve- 
ment motivation. This prediction is made on 
the basis of n Ach theory, which holds that 
those high in achievement motivation will 
continually seek self-improvement. In light of
New Zealand’s full employment conditions 
and the prevailing high turnover rates, those 
who would most quickly seek to improve their 
circumstances should be those high in achieve- 
ment motivation. 

This study was designed to examine the feasibility of using
nonprojective methods as a means of extending achievement
motivation theory. An investigation of the relationship among
achievement motivation, occupations, and labor turnover in
New Zealand was selected to fulfill that objective. The
following specific hypotheses were tested:


1. Entrepreneurs have higher levels of 
achievement motivation than individuals in 
other occupations. 

2. Entrepreneurs have lower labor turnover 
rates than individuals in other occupations. 

3. Middle managers have higher labor turnover rates than
entrepreneurs, engineers, and accountants.

4. Engineers, accountants, and middle managers who are high
in achievement motivation will exhibit higher labor turnover
rates than those who are low in achievement motivation.


Questionnaire 
as been demon- 


vels of achieve- 
reneurs, mana, 


ively discriminate le 


adings 
ults generally Supportive of 


land, 1961) ha of n Ach theory (McClel- 


vi 
ghanistan, Brazil, S 
(Iwawaki & Lynn, 
scale items were in 


een 


audi Arabi 
1972; iki 


; 
cluded in a questio 


sey, and Japan 
+ 1971), The 
nnaire mailed 


Occupation            M       SD      n
Entrepreneur         5.47    1.44     80
Engineer             4.68    1.33     74
Accountant           4.58    1.48     68
Middle manager       4.03    1.60     93

to a random sample of New Zealand businessmen as part of a
large national survey of management attitudes. The subjects
were asked to circle "yes" or "no" alternatives to the
following questions:


1. Do you find it easy to relax completely when you are on
holiday?

2. Do you feel annoyed when people are not punctual for
appointments?

3. Do you dislike seeing things wasted?

4. Do you like getting drunk?

5. Do you find it easy to forget about your work outside
normal working hours?

6. Would you prefer to work with a congenial but incompetent
partner, rather than with a difficult but highly competent
one?

7. Does inefficiency make you angry?

8. Have you always worked hard in order to be among the best
in your own line?

Scoring was on the basis of 1 point for "yes" answers to
Questions 2, 3, 7, and 8, and 1 point for "no" answers to
Questions 1, 4, 5, and 6. Labor turnover rates and demographic
data were [...].

The subjects consisted of 80 entrepreneurs ([...] having
started a business where no previous [...] existed), 74
engineers, 68 accountants, and 93 middle managers (employed
on a [...] basis below senior management level and above
[...]-line supervisors). Response rates for the [...]
categories were entrepreneurs, 54%; engineers, 53%;
accountants, 41%; and middle managers, 63%. The
questionnaire, along with a cover letter explaining its
purpose and a self-addressed stamped envelope, was mailed
during the period November 1971 to [...] 1972. Participation
was voluntary and anonymity was guaranteed.

A pretest was conducted on 342 business administration
students at Victoria University of Wellington, New Zealand,
and the results were compared with the British sample used by
Iwawaki and Lynn (1972). There were no significant differences
between the New Zealand (N = 342; M = 4.52, SD = 1.[...])
and British university (N = 622; M = 4.46, SD = 1.70)
samples (t = .57, ns).
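
The scoring key described above can be written out as a short sketch. The dictionary format for answers and the function names are assumptions; the keyed items and the low, average, and high cutoffs (3 or less, 4 or 5, 6 or more) are taken from the text.

```python
from typing import Dict

# Items scored 1 point for "yes" and items scored 1 point for "no".
YES_KEYED = {2, 3, 7, 8}
NO_KEYED = {1, 4, 5, 6}


def lynn_score(answers: Dict[int, str]) -> int:
    """Score the eight-item questionnaire; answers maps item number to 'yes' or 'no'."""
    if set(answers) != YES_KEYED | NO_KEYED:
        raise ValueError("answers must cover items 1 through 8")
    score = 0
    for item, answer in answers.items():
        if answer not in ("yes", "no"):
            raise ValueError(f"item {item}: answer must be 'yes' or 'no'")
        if item in YES_KEYED and answer == "yes":
            score += 1
        elif item in NO_KEYED and answer == "no":
            score += 1
    return score


def n_ach_level(score: int) -> str:
    """Classify a score as low (0-3), average (4-5), or high (6-8)."""
    if not 0 <= score <= 8:
        raise ValueError("score must be between 0 and 8")
    if score <= 3:
        return "low"
    if score <= 5:
        return "average"
    return "high"


# One hypothetical respondent (illustrative data only).
respondent = {1: "no", 2: "yes", 3: "yes", 4: "no",
              5: "yes", 6: "no", 7: "yes", 8: "no"}
total = lynn_score(respondent)
print(total, n_ach_level(total))  # prints: 6 high
```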
a significant difference at the p < .05 level
(t = 2.24).

Tables 2 and 3 depict achievement motivation levels and labor
turnover rates by occupational groups. High and low n Ach
levels were determined on the basis of mean entrepreneur and
middle manager scores. Individuals who scored 6 points or more
were assigned to the high-achievement-motivation category,
while those who scored 3 points or less were placed in the
low-achievement-motivation group. This division resulted in
25.7% "high achievers" and 24.1% "low achievers." Reference to
Table 2 shows that 36.3% of all entrepreneurs were rated as
high in n Ach, while 36.6% of middle managers were rated as
low in n Ach. Chi-square analysis revealed significant
differences among the groups (χ² = 17.45, df = 6, p < .01).


TABLE 2

ACHIEVEMENT MOTIVATION LEVELS BY
OCCUPATIONAL GROUP

                     Achievement motivation level
Occupation           Low     Average     High
Entrepreneur          11        40        29
Engineer              14        42        18
Accountant            17        31        20
Middle manager        34        45        14
Total                 76       158        81

Note. Low = score of 0, 1, 2, or 3 on the Lynn questionnaire;
average = score of 4 or 5; high = score of 6, 7, or 8.


TABLE 3

LABOR TURNOVER RATE BY OCCUPATIONAL GROUP

                     Labor turnover rate
Occupation           Low     Average     High
Entrepreneur          58        16         6
Engineer              10        50        14
Accountant            12        38        18
Middle manager        14        50        29
Total                 94       154        67

Note. Low turnover rate = one job in five years; average =
two jobs in five years; high = three or more jobs in five
years.


The results in Table 3 confirm Hypotheses 2 and 3, with 72.5%
of entrepreneurs having low turnover, as opposed to 31.2% of
middle managers having high turnover. Statistical analysis
indicated a highly significant turnover rate by occupational
group differences (χ² = 98.31, df = 6, p < .001).


TABLE 4

ACHIEVEMENT MOTIVATION LEVEL BY
LABOR TURNOVER RATE

                     Achievement motivation level
Labor turnover       Low     Average     High
Low                   28        46        20
Average               40        81        33
High                   8        31        28
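
The chi-square analysis of turnover rate by occupational group can be retraced from the cell counts in Table 3. The sketch below uses the ordinary Pearson chi-square for a contingency table with no continuity correction; whether the author used exactly this procedure is an assumption. Run on the Table 3 counts, it gives a statistic of roughly 98 with 6 degrees of freedom, close to the reported 98.31.

```python
from typing import List, Sequence, Tuple

# Table 3 counts: rows are occupations, columns are low, average, high turnover.
TABLE_3: List[List[int]] = [
    [58, 16, 6],   # Entrepreneur
    [10, 50, 14],  # Engineer
    [12, 38, 18],  # Accountant
    [14, 50, 29],  # Middle manager
]


def pearson_chi_square(table: Sequence[Sequence[int]]) -> Tuple[float, int]:
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof


chi2, dof = pearson_chi_square(TABLE_3)
print(round(chi2, 2), dof)
```

The same function can be pointed at the counts in Tables 2 and 4; small departures from the published statistics are to be expected if the original computation rounded intermediate values.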

Table 4 illustrates that among the high- 
turnover group, there were 350% more sub- 
jects with high n Ach than with low n Ach. 
Overall differences in the table proved to be 
significant (χ² = 14.91, df = 4, p < .01). Table 5,
which is an occupational breakdown of
Table 4, confirms the final experimental hy- 
pothesis. In the pooled engineer—accountant— 
middle-manager group, 48% of those with 
high n Ach also displayed high turnover. In 
the low-turnover category, only 19% of the 
pooled group had high n Ach. 


DISCUSSION 


TABLE 5


ACHIEVEMENT MOTIVATION LEVEL BY LABOR TURNOVER
RATES FOR OCCUPATIONAL GROUPS


                         Achievement motivation level
Labor
turnover          Entrepreneurs             Engineers, accountants,
rate                                        and middle managers
                Low   Average   High        Low   Average   High

Low               7      28      23          23       3      10
Average           2      11       3          36      85      17
High              2       1       3           6      30      25


The results of this study indicate that engineers,
accountants, and middle managers who are high in n Ach exhibit
greater labor turnover than those who are low in n Ach. This
would appear to reflect a lack of perceived opportunity to
achieve or to exceed some self-imposed standard of excellence
in their job performance. The low achievers, on the other
hand, tend to have low turnover. If one accepts McClelland's
argument that those with high n Ach do not necessarily make
the best managers, the results for the middle managers lend
indirect support to the general model. Although there is no
evidence to confirm that the low-turnover, low n Ach New
Zealand managers in this study are in fact successful
managers, their pattern of stability in employment suggests
that their companies are satisfied with their performance. It
seems reasonable to hypothesize that these individuals may
conform to the low n Ach, high




The high level of n Ach in New Zealand 


nal patterns 
model. The 
entrepreneur is typically portrayed as a mod- 
d by a sense 
and by a need for 
Its as measured by 
tudy do not permit 


Ccess or failure of 
but the high 


t i tendency 
engineers, account; 
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ould p, 


to se 
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well he is doj i 
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i ariable—the organizational 
climate in which an individual Spends his 
work day and makes his decisions. Based on 
the results of this i igati 

Comparisons among 
levels of achievement motivation. Given the unique
motivational characteristics of the New Zealand business
environment, particularly full employment and egalitarian
ethos, the findings should be treated with caution. High
achievers working in a high [...] situation quite probably
would not and could not change jobs easily or for [...].
Such practice is common in New Zealand, as is the tendency for
employers to [...] employees equally regardless of formal
education or potential contribution (Hines, 19[...]).
It seems clear, however, that [...] theorists should give
careful attention to the differential behavior of those
manifesting varying degrees of n Ach. The development of
nonprojective techniques to measure n Ach makes such
large-scale exploratory research both practical and rewarding.
Longitudinal data that were gathered for a three-year period were used to
assess the validity of retrospective reports of changes in job satisfaction,
communication, and coordination. These data were collected from a large
(over 400 beds), community, [...] hospital in which intervention occurred.
The data were [...] and contingency analyses. Although it [...] of change
to some extent do measure before-after change, it was also noted that
retrospective measures are not accurate enough to be considered as
substitutes for computed-change measures.


Retrospective questions concerning changes in attitudes and
personal data have been used as substitutes for before-after
change measures, usually with explicit caveats by the authors
about the validity of the perceived-change measures (e.g.,
Baumgartel, 1954; Hardin, 1960; Mann & Hoffman, 1960; and
Whyte, 1951). Two studies [...] attitudinal data [...]

Hardin (1965), studying changes in job satisfaction from the
beginning to the end of a six-month period (before and after
a technological change), found a significant correlation
between the computed- and perceived-change measures (r = .28,
p < .01). Regression analyses revealed this correlation was
due to the strong relationship of perceived change with final
satisfaction, rather than with before-after change. Hardin
(1965) concluded that the quasi-longitudinal design using
retrospective questions is a very weak substitute for a
genuine longitudinal design to study change.

This paper analyzes the accuracy of retrospective perceptions
of change for a three-year period in measures of (a) job
satisfaction, (b) communication, and (c) coordination.
Comparisons are made with computed change derived from
measurements taken before and after an intervention in a
hospital setting.


METHOD


Site and Respondents

Mailed questionnaires were completed by registered nurses in
a large (over 400 beds), community [...] 65; the response
rate [...] and were assured of the [...] of their responses.
There were 102 nurses who [...] the first and second
questionnaires [...] used in this analysis, while 12 had some
missing data. These 12 were eliminated from the analysis.


Intervention 


During the spring and fall of 1966, the survey results were
presented sequentially to one group consisting of the hospital
and nursing administrators and supervisory nurses, one group
of head nurses, and seven groups of staff nurses. All
data-feedback
groups were homogeneous in level but heterogeneous 
in shift and work location. There were 50 meetings 
in which the survey data were presented and dis- 
cussed; at these meetings, suggestions for the solu- 
tions of problems were encouraged. Following these 
interventions, staff nurse turnover rate decreased and 
stabilized at approximately one half of its previous 
rate in this hospital, while the rate for seven other 
hospitals in the same area remained unchanged. 


Measures 


In both questionnaires the respondents were asked 
the following questions: 


1. Considering your job as a whole, how well do you like it?

(a) I don't like my job at all, (b) don't like it
too well, (c) like some things about it, dislike
others, (d) like it fairly well, (e) I like my job
very much.

2. When decisions that affect your work are made,
on the whole, how adequately are such decisions
explained to you?

(a) inadequately, (b) somewhat inadequately,
(c) fairly adequately, (d) very adequately, (e)
completely adequately.

3. To what extent are the various interrelated
things and activities well-timed in the everyday
routine of the hospital?

(a) they are rather poorly timed, (b) they are
not so well-timed, (c) they are fairly well-timed,
(d) they are well-timed, (e) all related
things and activities in the everyday routine
are perfectly well-timed.

The second questionnaire included a question asking
if the respondent were working in that particular
hospital during the fall of 1965 as well as these
retrospective questions:


1. Considering everything, would you say you are
now more or less satisfied with your
job than you were three years ago?

(a) much less satisfied now with my job than
three years ago, (b) less satisfied now, (c) no
more, no less satisfied, (d) more satisfied now,
(e) much more satisfied now than three years
ago with my job.

2. Do you feel that decisions affecting your job
are more adequately explained to you now than
three years ago?

(a) decisions are explained much less adequately
than before, (b) less adequately, (c) no more,
no less adequately, (d) more adequately, (e)
much more adequately than three years ago.

3. Do you feel that the various interrelated activities
in the everyday routine of the hospital are
better or more poorly timed now than three
years ago?

(a) the various activities are much more poorly
timed now than three years ago, (b) more
poorly timed now, (c) no better, no worse, (d)
better timed now, (e) much better timed now.


A measure of computed change for each of the 
sets of items was formed by subtracting the first 
questionnaire response value from the second. 

Individuals who gave the same extreme response 
to both an initial and a final status item were elimi- 
nated from the analysis. This procedure, used also 
in Hardin's (1965) analysis, was necessary to prevent 
spurious attenuation of relationships due to bounded 
limits of the computed-change scale. Compared
to results using the complete sample, the effects
of this procedure were negligible for percentages 
of respondents classified alike on perceived- and 
computed-change measures, and correlations between 
these measures differed substantially only for job 
satisfaction, when attenuation would have lowered 
the correlation from .41 to .29. 
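
The computed-change measure and the exclusion rule described above can be sketched as follows. The 1-to-5 numeric coding of response options (a) through (e) is an assumption made for illustration, as are the function names and the example data.

```python
from typing import List, Sequence, Tuple

# Assumption: response options (a)-(e) are coded 1-5.
SCALE_MIN, SCALE_MAX = 1, 5


def computed_change(initial: int, final: int) -> int:
    """Second-questionnaire value minus first-questionnaire value."""
    return final - initial


def is_same_extreme(initial: int, final: int) -> bool:
    """True when both responses sit at the same end of the scale."""
    return initial == final and initial in (SCALE_MIN, SCALE_MAX)


def direction(change: int) -> str:
    """Code a change score as decreased, no change, or increased."""
    if change < 0:
        return "decreased"
    if change > 0:
        return "increased"
    return "no change"


def classify(pairs: Sequence[Tuple[int, int]]) -> List[Tuple[int, str]]:
    """Drop same-extreme respondents, then return (change, direction) for the rest."""
    kept = [(i, f) for i, f in pairs if not is_same_extreme(i, f)]
    return [(computed_change(i, f), direction(computed_change(i, f))) for i, f in kept]


# Hypothetical (initial, final) responses for five respondents.
sample = [(1, 1), (5, 5), (3, 3), (2, 4), (4, 3)]
print(classify(sample))
# prints [(0, 'no change'), (2, 'increased'), (-1, 'decreased')]
```

Respondents at the same extreme on both occasions are removed because their computed change is forced to zero by the bounds of the scale, whatever change they perceived.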


RESULTS 


For each of the three content areas, cor- 
relations among the measures of perceived 
change, computed change, initial status (be- 
fore intervention), and final status (after 
intervention) are presented in Table 1. When 
each respondent was coded as decreased, no 
change, or increased on each change measure, 
contingency groupings of the perceived- 
change and computed-change measures were 
computed. 
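
The contingency grouping of perceived and computed change can be illustrated with a small tally. This is a sketch under stated assumptions: perceived-change options (a) through (e) are taken to be coded 1 to 5 with 3 as the neutral midpoint, status items are coded 1 to 5, and the three agreement categories mirror the wording used in the results (classified alike, classified in opposite directions, changed on one measure but not the other). The record format and example data are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def perceived_direction(response: int) -> str:
    """Direction implied by a retrospective item (assumed coding: 3 = no change)."""
    if response < 3:
        return "decreased"
    if response > 3:
        return "increased"
    return "no change"


def computed_direction(initial: int, final: int) -> str:
    """Direction implied by the before-after difference."""
    if final < initial:
        return "decreased"
    if final > initial:
        return "increased"
    return "no change"


def agreement(perceived: str, computed: str) -> str:
    """Compare the two directions for one respondent."""
    if perceived == computed:
        return "alike"
    if "no change" in (perceived, computed):
        return "one measure only"
    return "opposite"


def agreement_percentages(records: Sequence[Tuple[int, int, int]]) -> Dict[str, float]:
    """records holds (perceived response, initial status, final status) triples."""
    tally = Counter(
        agreement(perceived_direction(p), computed_direction(i, f))
        for p, i, f in records
    )
    total = len(records)
    return {key: 100.0 * count / total for key, count in tally.items()}


# Four hypothetical respondents.
records = [(4, 2, 4), (2, 2, 4), (3, 2, 4), (3, 3, 3)]
print(agreement_percentages(records))
```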


Job Satisfaction 


Differences in perceived change in job satis- 
faction were significantly and positively cor- 
related with differences in computed change 
(see Table 1). This correlation was not due to 
the perceived-change measure being related 
only to final status, as the low correlation 
between the status measures indicates. In fact, 
comparison of the status measures' absolute 
correlations with perceived change reveals 
that initial status contributed most to the 
perceived- and computed-change correlation. 
Inspection of the means and variances of the 
status measures explicated these perceived- 
change and status correlations. No mean 
change was evident (t = .24, df = 46),
whereas a test for correlated variances indi-
cated that the variability of actual job satis-
faction was significantly greater for the initial
measurement as compared to the final mea-
surement (t = 2.42, df = 45, p < .05). Thus,
those respondents who were satisfied before
the intervention perceived themselves to have
become less satisfied, whereas those respon-
dents who were dissatisfied perceived them-
selves to have become more satisfied.

When contin[...]

[...] remaining 49% were
classified as changed by one measure and not
changed by the other.

[...] has with the [i]nitial status [...]
[fi]nal status and perceived-change measures.

A contingency grouping [of the change mea]sures
of job communi[cation ...] similar to that for
[...]. [Of ...]5 persons, 46% were classified alike,
7% were classified [as changed] in opposite
directions, and 47% were classified as changed
on one measure but not on the other.

Job Coordination 


Perceived change in job coordination was
not related to the status measures or com-
puted change, as shown in Table 1. The sig-
nificant correlation between the status mea-
sures suggests that change in job coordination
was minimal.

Contingency grouping of the change mea-
sures of job coordination revealed that de-
spite “minimal change,” the retrospective
measure was similar to previous measures in
identifying initial and final status [...].
Of 88 persons, 41% were classified alike,
8% were classified as changed in opposite di-
rections, and 51% were classified as changed
on one measure but not on the other.


DISCUSSION


The correlation between perceived- and
computed-change measures was highest for job
satisfaction. Unlike Hardin’s (1965) findings,
this correlation was not simply due to the
perceived-change measure’s relationship with
final status. Rather, this measure reflected a
reduction in the variability of job satisfaction
from initial to final status. On the other hand,
the perceived measures of communication and
coordination were not as suitable as indicators
of before-after change. Perceived change in
communication was significantly correlated
with computed change, but this was due to its
relationship with final status and did not
reflect before-after change.

A second method of assessing the present
data involved classifying respondents on each
of the perceived- and computed-change mea-
sures as decreased, no change, or increased.
The results obtained by this method in this 
study and from the Hardin (1965) and Fink 
(1960) studies are quite similar in that ap- 
proximately 40% of the respondents were 
classified alike on each measure, from 5% to
13% as changing in opposite directions on the
two measures, and the remaining respondents
as changing on one measure but not on the 
other. However, to conclude from these simi- 
larities that the perceived-change measures 
in the present study are completely inaccu-
rate as indicators of before-after change is
not correct. A comparison of the accuracy
of perceived-change measures in classifying
before-after change reflected substantial dif- 
ferences between the three studies. The job 
satisfaction and communication measures cor- 
rectly identified large percentages of those 
respondents whose before-after response in-
creased (75%) or did not change (70%),
respectively. These accuracy percentages de-
viated significantly from expected proportions
under chance conditions of one-third accurate
and two-thirds inaccurate for job satisfaction
(χ² = 12.37, df = 1, p < .001, n = 16) and
communication (χ² = 18.19, df = 1, p < .001,
n = 30). In contrast, the perceived-change
measures in the Hardin (1965) and Fink 
(1960) studies were relatively inaccurate as 
indicators of before—after change. Only Fink’s 
measure deviated from chance expectancy 
by classifying a significant percentage (41% ) 
of those respondents whose before-after re-
sponse did not change (χ² = 14.87, df = 1,
p < .001, n = 707). However, this significance
was influenced strongly by sample size as
reflected by a φ coefficient index (.13) com-
puted from the values of chi square and
sample size.
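
The chance comparison and the φ index used above can be sketched as below. This is an illustrative reading, not the authors' computation: it assumes a one-degree-of-freedom goodness-of-fit chi square against one-third accurate and two-thirds inaccurate, and φ taken as the square root of chi square over sample size. The count of 21 correct out of 30 is an assumption obtained by rounding the reported 70%, so the result only approximates the reported value.

```python
import math


def chance_chi_square(n_correct, n_total, p_chance=1 / 3):
    """Goodness-of-fit chi square (df = 1) against a chance accuracy rate."""
    expected_correct = n_total * p_chance
    expected_wrong = n_total - expected_correct
    n_wrong = n_total - n_correct
    return ((n_correct - expected_correct) ** 2 / expected_correct
            + (n_wrong - expected_wrong) ** 2 / expected_wrong)


def phi_coefficient(chi_square, n_total):
    """Effect-size index that removes the influence of sample size."""
    return math.sqrt(chi_square / n_total)


# Assumed counts: 70% of n = 30 taken as 21 correctly classified respondents.
chi2 = chance_chi_square(21, 30)
print(round(chi2, 2))                       # 18.15
print(round(phi_coefficient(chi2, 30), 2))  # 0.78
```

Because φ divides chi square by the number of cases, a very large sample can yield a significant chi square together with a small φ, which is the point made about the larger sample above.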

Although the present study does not con- 
firm Hardin's finding that perceived-change 
measures merely reflect final or present status,
it does support his conclusion that perceived
change is not an adequate substitute for
before-after change. The two methods of as-
sessing the present data indicated that per-
ceived change reflected before-after change on
two of the three measures. Nonetheless, even 
these measures were far from one-to-one sub- 
stitutes for the computed-change measure. As 
mentioned above, the job satisfaction and com- 
munication measures identified accurately large 
percentages of before-after responses, but only 
for one of the three classifications of before- 
after change. Despite the greater cost and 
difficulty of a genuine longitudinal design, it 
appears that this design should be used to 
assess before-after change, rather than a 
quasi-longitudinal design employing retrospec- 
tive reports of change. 

Finally, the present study only investigated 
retrospective change as a substitute for 
before-after change. Other empirical issues 
beyond the scope of this paper are the valid- 
ity of change perceptions as predictors of 
future attitudes or behaviors and whether 
perceived-change measures might not be more 
stable reports of past change than the usual 
before-after measures. 
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Certain problems associated with the use of deficiency 


scores in job attitude research are presented.
A logical and a psychological constraint are identified
and their operation is demonstrated with empirical and
simulated studies. Part correlation and partial
correlation techniques are recommended as ways of
interpreting relationships between deficiency
scores and other variables.


Since the influential studies of Porter
(1962) on need deficiencies among managers,
investigators of job attitudes have made in-
creasing use of [...] (for example, [...]
McNemar, 1958; Werts & Linn, 1970). Cronbach and
Furby (1970) have observed that “although
[...] of such scores has long been
[...]ed, even by [...] investigators [...]
have been seri[...]

[...] datum may be misinterpreted [...]
research. First, by using si[...]
is demonstrated [...]


1 The authors wish to acknowledge the valuable
advice offered by J. Clarkson [...].
2 Requests for reprints should be sent to [...]
Wall, MRC Social and Applied Psychology [...],
University of Sheffield, Sheffield, S10 2 ME.


[con]straints inherent in the derivation of defi-
ciency scores, obtained relationships between
such scores and an independent variable may
reflect no more than the relationship between
one of the two component measures of the
deficiency score and that independent varia-
ble. Second, it will be shown that psychologi-
cally meaningful results may be masked due
to the operation of the constraints inherent in
deficiency scores. Finally, partial correlation
techniques will be applied to a recent study
by Wanous and Lawler (1972) to illustrate
how misinterpretations can occur.


CONSTRAINTS INHERENT IN THE DERIVATION
OF DEFICIENCY SCORES


When a rating of the existing level of a
job characteristic is subtracted from a desired
level rating, a logical constraint comes into
effect. If a 7-point scale is used, a subject scoring
5 on the existing level can
obtain a deficiency score of -4 to 2, whereas
a subject scoring 2 may obtain a deficiency
score of -1 to 5. Deficiency scores for those
with high existing levels of a job characteristic
will consequently tend to be lower than the
deficiency scores of those with low
perceived existing levels. All other
things being equal, there will be a negative
relationship between existing level scores and
deficiency scores, the deficiency score being
strongly, though not completely, determined
by the existing level score. It follows that any
independent variable which is positively re-
lated to the existing level score will tend to
be negatively related to the deficiency score
for no other reason than that due to its origi-
nal positive relationship with the existing level
score. An example serves to illustrate the
implications of this argument.

Porter (1962) found that deficiency in es- 
teem, autonomy, and self-actualization was 
negatively related to managerial level. If we 
make the reasonable assumption that higher 
level managers tend to have more of these 
three job characteristics than do their less 
senior counterparts (data relevant to this as- 
sumption are not reported by Porter), then, 
because of the logical constraint, a negative 
relationship between managerial level and de- 
ficiency scores will automatically occur. It 
may well be that Porter’s original results 
reflect no more than a strong relationship 
between managerial level and the reported 
existence of esteem, autonomy, and self-ac- 
tualization. If this is so, then an interpreta- 
tion in terms of deficiency in need fulfillment 
is misleading. 
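
The logical constraint can be checked with a short sketch. It is an illustration under stated assumptions rather than anything reported in the paper: both ratings are taken to lie on a 1-7 scale, and every combination of existing and desired rating is treated as equally likely, so that any correlation comes from the subtraction alone.

```python
import math
from itertools import product


def deficiency_range(existing, scale_min=1, scale_max=7):
    """Lowest and highest possible deficiency score (desired minus existing)."""
    return scale_min - existing, scale_max - existing


def pearson(x, y):
    """Plain Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Every (existing, desired) pair on a 7-point scale, each counted once.
pairs = list(product(range(1, 8), repeat=2))
existing = [e for e, _ in pairs]
deficiency = [d - e for e, d in pairs]

print(deficiency_range(5))  # (-4, 2)
print(deficiency_range(2))  # (-1, 5)
print(round(pearson(existing, deficiency), 2))  # -0.71
```

Even though the existing and desired ratings are unrelated in this grid, the deficiency score correlates about -.71 with the existing level, which is why a variable that tracks existing level will also track deficiency.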

A second constraint affects the deficiency 
score, and this we shall label the psychologi- 
cal constraint. In practice it is found that 
when subjects are asked to rate how much of 
a (desirable) characteristic is associated with 
their job and then to rate how much of that 
characteristic should be associated with their 
job, they rarely state that there should be less 
than there is. Porter’s research (1962) shows 
that deficiency scores are predominantly posi-
tive, while in two studies involving 450 sub-
jects we found that only 5% of the respon-
dents obtained negative deficiency scores. The
occurrence of this psychological constraint is
relevant to the logical constraint in two re-
spects. First, it means that deficiency scores
for those with high-perceived existing levels
of a given job characteristic will fall within a
more restricted range than they will for those
with lower existing levels. Second, where in-
vestigators have tried to overcome the logical
constraint by considering only the magnitude
of the deficiency score (ignoring whether it is
positive or negative), the logical constraint
will in fact still operate. There will still be a
negative relationship between existing level
scores and deficiency scores.

The following study illustrates the effects
of these two constraints and their relevance
for interpreting deficiency score data.


METHOD 
Subjects and Procedure 


Two empirical studies were carried out and in- 
volved 37 and 29 nurses, respectively, from six job 
levels. These studies were originally designed to eval- 
uate the relationship between job level and perceived 
deficiency in participation in decision making at the
ward level. Each nurse answered the following two 
items: 


1. How much say do you have in decisions made 
at ward level? [existing participation]. 

2. How much say should you have in decisions 
made at ward level? [desired participation]. 


Each item was rated on a 7-point scale, and the 
deficiency score obtained by subtracting the score 
for Item 1 from that for Item 2. 

A simulated study involving random generation 
of desired participation scores was also carried out. 
The purpose of this study was to demonstrate that 
given the psychological and logical constraints, a 
relationship between deficiency scores and an inde- 
pendent variable is strongly determined by the rela- 
tionship of the independent variable with the exist- 
ing participation scores. The simulated study used 
job level data and existing participation scores from 
the empirical study. The desired participation scores 
from the empirical study were replaced by scores 
generated randomly. The psychological constraint 
was operationalized by making the randomly gen- 
erated score attributed to an individual equal to or 
greater than his existing participation score. The 
randomization procedure also ensured that for the 
same existing participation scores, the randomly pro-
duced desired participation scores approximated a
normal distribution.³ Deficiency scores were calcu-
lated by subtracting each subject’s present participa-
tion score from the randomly generated desired par- 
ticipation score assigned to him. Four such simulated 
studies, for each of the two empirical studies, were 
carried out, and the average correlations across the 
runs are used as results. 
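
A minimal sketch of such a simulation follows. It is not the authors' procedure: the job levels and existing scores are hypothetical, the "approximately normal" generator is assumed to be a rounded normal draw (mean and standard deviation chosen arbitrarily) rejected until it falls between the existing score and 7, and the handling of existing scores of 6 and 7 follows the rule given in the article's footnote to this passage.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either variable has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def simulate_desired(existing, rng, scale_max=7, sd=1.5):
    """Random desired score that is never below the existing score."""
    if existing >= scale_max:
        return scale_max                      # existing 7: desired must be 7
    if existing == scale_max - 1:
        return rng.choice([existing, scale_max])  # existing 6: 6 or 7, equally likely
    while True:
        draw = round(rng.gauss(existing + 1.5, sd))  # assumed mean and sd
        if existing <= draw <= scale_max:     # psychological constraint
            return draw


def simulated_correlation(job_level, existing, runs=4, seed=0):
    """Average r between job level and simulated deficiency over several runs."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        desired = [simulate_desired(e, rng) for e in existing]
        deficiency = [d - e for d, e in zip(desired, existing)]
        results.append(pearson(job_level, deficiency))
    return sum(results) / runs


# Hypothetical data: higher job levels report more existing participation.
job_level = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6]
existing = [1, 2, 2, 3, 3, 4, 3, 5, 5, 6, 6, 7]
print(simulated_correlation(job_level, existing))
```

Because the desired scores carry no information about job level, whatever correlation this produces between job level and deficiency is attributable to the existing scores and the two constraints.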


RESULTS 


Table 1 shows that in the first empirical 
study the subjects at higher job levels obtain 
lower scores for deficiency in participation at 
ward level (r = —.52). The relevant simu- 
lated studies, however, demonstrate that given 
only the psychological and logical constraints 
³ For existing participation scores of 6 and 7 on
the 7-point scale, it was not possible to generate a
normal distribution of desired-participation scores.
For the existing scores of 6, the randomly generated
desired scores of 6 and 7 were given an equal proba-
bility of occurrence, while for existing scores of 7
all the randomly generated scores, as the psychologi-
cal constraint required, were 7.

The second study illustrates our claim that
[psychologically] meaningful results may be masked
by the use of deficiency scores. This study yields
a correlation between job level and deficiency in
participation of .21, which is nonsignificant if
evaluated against a zero correlation null hypothesis.
The relevant simulated studies, [...] that given the
relationship be[tween job level] and existing
participation [...] and the two [con]straints, a
correlation of [...] [between job] level and
deficiency in participation [is to be] expected.
Using this latter correlation as the “correct”
reference point, then, the [obtained] correlation
of .21 is statistically significant (t = 2.27,
p < .05). We may now [...] that in this study,
nurses at higher job [levels] have greater
deficiency in participation [at ward] level than
do their subordinates. In fact, additional
interview material strongly [supports] this
conclusion, and it was this [...] which led us
to question the validity of the deficiency score
in the first instance.


DISCUSSION
An Alternative Method of Analysis

[...] [defi]ciency scores on a variable such as jo[b ...]
Furthermore, the simulated studies dem-
onstrated that psychologically meaningful re-
sults may be masked by the constraints in-
herent in deficiency scores. However, while
the simulated studies have merit in illustrat-
ing the problems, they represent a [...]
cumbersome procedure. A simpler [...]
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TABLE 2 


CORRELATIONS OF DIRECT SATISFACTION RATINGS AND OTHER OPERATIONALIZATIONS OF SATISFACTION

                                              Should be   Would like  Should be minus  Would like minus
Job facet                            Is now   minus       minus       is now, is now   is now, is now
                                              is now      is now      partialed out    partialed out

Self-esteem                           .58*    -.30*       -.27*        .07              92°
Opportunity for growth                .61*    -.38*       -.46*        .06              .05
Feeling of security                   .67*    -.43*       — 92"        .15              .09
Opportunity to get to know others     .42*    -.19*       -.25*        .12              .09
Freedom on the job                    .63*    -.42*       -.45*       -.06              .06
Pay for job                           .79*    -.51*       -.57*        .02              .04

Note. N = 208.
* p < .01.


lent results, an assertion which may be dem-
onstrated by holding existing level scores con-
stant in the above two empirical studies.
When this is done, the partial correlation be-
tween job level and deficiency scores in Em-
pirical Study 1 is .15 (t = .88, ns) and in Em-
pirical Study 2 it is .41 (t = 2.29, p < .05).
The analysis of deficiency scores offered by 
Werts and Linn (1970) leads to the conclu- 
sion that part correlation (where the influence 
of existing level scores is removed only from 
deficiency scores) is a more appropriate pro- 
cedure. The choice between partial and part 
correlation depends upon logical considera- 
tions based upon the data being analyzed. As 
a reasonable case can be made for arguing 
that job level itself may be a determinant of 
existing level scores, in the present data par- 
tial correlation seems the more justifiable 
technique. In the more general cases discussed 
by Werts and Linn (1970), part correlation 
is more likely to be applicable. However, both 
techniques produce very similar results with 
the present data: in Empirical Study 1 the
part correlation between deficiency scores and
job level is .09 and in Empirical Study 2 it is
.40.
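
The two techniques can be written out from the three zero-order correlations. The sketch below uses the standard first-order formulas; the variable labels (x for job level, y for the deficiency score, z for the existing level score) are ours, and the significance test with n - 3 degrees of freedom is the usual one for a first-order partial correlation, assumed here to be what was applied. The zero-order values passed to the first two calls are hypothetical; the last two calls use the partial correlations and sample sizes reported for the two empirical studies.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant in both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def part_r(r_xy, r_xz, r_yz):
    """Part (semipartial) correlation: z is removed from y only."""
    return (r_xy - r_xz * r_yz) / math.sqrt(1 - r_yz ** 2)


def t_for_partial(r, n):
    """t statistic for a first-order partial correlation, df = n - 3."""
    return r * math.sqrt(n - 3) / math.sqrt(1 - r ** 2)


# Hypothetical zero-order correlations, for illustration only.
print(round(partial_r(0.5, 0.3, 0.4), 3))  # 0.435
print(round(part_r(0.5, 0.3, 0.4), 3))     # 0.415

# Reported partial correlations and sample sizes of the two empirical studies.
print(round(t_for_partial(0.15, 37), 2))   # 0.88
print(round(t_for_partial(0.41, 29), 2))   # 2.29
```

The part correlation can never exceed the partial correlation in absolute size, since its denominator omits one factor that is at most 1.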


A Further Illustration from a Recent 
Empirical Study 

The numbers in these studies are very 
small and, even though confidence levels take
the size of the sample into account, there are 
risks in basing conclusions on such samples. 
For this reason, it is felt that the argument 
can be usefully expanded by reanalyzing the 


findings of a recent study in which the sample 
size is more respectable (N = 208). Wanous
and Lawler (1972) set out to explore the in- 
terrelationships among various ways of mea- 
suring job satisfaction. Several of the methods 
involved deficiency scores. Table 2 extracts 
from their Table 2 the correlations for existing 
level scores and two of their discrepancy 
scores with direct ratings of satisfaction for 
six job facets. We have added two columns 
that contain the partial correlations between 
direct satisfaction measure and the two defi- 
ciency scores (should be minus is now and 
would like minus is now) with present level 
(is now) held constant. As Table 2 shows, 
when present level is held constant, correla- 
tions are near to zero.* The relationships be- 
tween direct measures of satisfaction and the 
two deficiency scores clearly reflect little 
more than the relationship of the measure of 
satisfaction with the ratings of existing levels. 
Except in one instance (self-esteem) the 
“should be" and “would like" scores make 
little contribution to the relationships between 
deficiency scores and job satisfaction mea- 
sures. 

In instances where deficiency scores re-
veal relationships above and beyond those
determined by existing level scores, does this 
suggest they still have a value? The answer 
appears to be negative. Werts and Linn 


4 Where part correlations are computed, following
the analysis of Werts and Linn (1970), an
equivalent pattern of results is obtained; all the part
correlations are of less magnitude but with a differ-
ence not exceeding .04.

(1970) have demonstrated that such relation-
ships can be discovered by correlating an inde-
pendent variable with the desired score and
holding the existing score constant. In other
words, any relationship of a deficiency score
with an independent variable is due either to
one or both of its component parts being
related to that independent variable. Quite
simply, the deficiency score as traditionally
operationalized is not more than the sum of its
parts. Conceptually, however, it has been
treated as such. Nevertheless, the conceptual
logic behind the deficiency score has consid-
erable appeal: It does imply a rational self-
assessment of feeling. The operationalization
of the concept might be better achieved with
an item such as “How much more would you
like than you have now?” This copes with
the two constraints by allowing the existing
level scores to be the anchor point. It also
avoids the danger of attributing a meaning to
the responses which may not be there: One
would be less likely to conclude that the ques-
tion is necessarily measuring satisfaction. Al-
ternatively, we could continue with the present
format but allow our subjects to do their own
arithmetic, that is,

How much is there now?
How much would you like?
And (having considered the above two
questions) how satisfied are you?

It is just possible that in the mathematics
of affect, seven minus five does not necessarily
equal two. Th[...] [th]ese alternatives
needs to be e[...]

[...] It has been shown, [...]
presenting original data and b[...]
are no more than an attenuated [...]
the relationships between existing [...]
and those independent variables. [...]
it has been demonstrated that [deficiency]
scores may mask otherwise [... rela]tionships.
Given that it is possible to [...], by use of
part or partial correlation, the relationships
involving deficiency scores using the raw
scores alone, the calculation of deficiency
scores is redundant. [More impor]tantly,
however, this procedure avoids the danger of
interpreting deficiency scores [as] if they were
more than the sum of [their] parts. For these
reasons, we strongly [...] with the advice
offered by Cronbach and Furby (1970) that
“deficiency,” [...] “gain” scores should be
avoided, and raw scores only should be used.
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EFFECTS OF A REALISTIC JOB 


PREVIEW ON JOB ACCEPTANCE, 


JOB ATTITUDES, AND JOB SURVIVAL¹


JOHN P. WANOUS²


Graduate School of Business Administration, New York University 


A field experiment was conducted in a telephone company to assess the effects
of a realistic job preview versus an unrealistic (i.e., “traditional”) preview.
Those who received the realistic job preview subsequently had more realistic
job expectations, fewer thoughts of quitting, and slightly higher job survival
in comparison to those given the traditional preview. There was no differ-
ence in job acceptance rates between the two groups. The results are discussed
in light of the general process of individuals joining new organizations.
Several suggestions for future research are offered.


Most organizations are constantly involved
in a “matching process” between individuals
and the organization (Argyris, 1964). On the
one hand, new recruits to an organization
possess individual talents (such as skills,
abilities, and knowledge) as well as important
psychological needs. On the other hand, the
typical organization can be viewed as having
talent requirements for various jobs as well as
its own particular climate characteristics.
Thus, there are two important “match-ups”
which occur during the process of new
employees entering an organization: (a)
individual talent with organizational talent
requirements and (b) individual needs with
the need-fulfilling characteristics of the job.
This general process of matching the individ-
ual and the organization is continuous because
both people and jobs change over time and
because there is a constant labor force move-
ment such as hires, promotions, quits, and
fires.

Industrial psychologists traditionally have
studied the match between individual talent
and organization requirements rather than the
latter match-up. They have also tended to
study this primarily from the viewpoint of the
organization selecting the individual rather
than the other way around. This focus
typically included predictor and criterion
development, a variety of statistical prediction
procedures, and job analysis.


¹ This article is based on an unpublished doctoral
dissertation from the Department of Administrative
Sciences, Yale University, 1972. The author would
like to thank Edward E. Lawler III, Benjamin
Schneider, J. Richard Hackman, and Clayton P.
Alderfer for their assistance.
² Requests for reprints should be sent to John P.
Wanous, Graduate School of Business Administration,
New York University, 100 Trinity Place, New York,
New York 10006.

More recently, other researchers of organi- 
zational behavior have expanded the scope of 
inquiry to include emphasis on the matching 
of individual needs and organizational char- 
acteristics. The present research is represent- 
ative of this general thrust. Other examples 
include the work of Schneider and Bartlett 
(1970) and Schneider (1972) relating indi- 
vidual work preferences and expectations to 
organizational climate, Berlew and Hall’s 
(1966) study of the socialization process, 
Litwin and Stringer’s (1968) research on 
achievement motivation and organizational 
climate, Friendlander and Margulies’s (1969) 
analysis of the impact of organizational 
climate and individual values on job satisfac- 
tion, the organizational choice studies of 
Vroom (1966) and Vroom and Deci (1971), 
the Minnesota studies of vocational adjust- 
ment (Lofquist & Dawis, 1969), and finally, 
a series of organizational climate investiga- 
tions reported in Tagiuri and Litwin (1968). 

Both correlational and experimental ap- 
proaches have been taken in studying the 
relationships between individual needs and 
organizational characteristics. Those doing 
correlational research in this area typically 
measure both individual and organizational 
characteristics, assess the resulting pattern of 
similarities and differences, and relate the 
pattern to such dependent variables as absen- 
teeism, turnover, performance, and job 
satisfaction. 
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Of the experimental studies, a very small 
group has specifically focused on the effects of 
job preview "treatments" on the subsequent 
job behavior of new organization members 
(Gomersall & Meyers, 1966; Macedonia, 
1969; Weitz, 1956; Youngberg, 1963). 
Methodologically, the typical study in this 
group compared two groups of candidates, 
one of which received a “dosage” of realistic 
preview information and the other which 
received no systematic job preview. The 
results of such an experimental comparison 
usually have been assessed in terms of sub- 
sequent turnover or job survival. A consistent 
finding has been higher job survival rates for 
those receiving realistic job previews. In addi-
tion, Gomersall and Meyers (1966) found that
a realistic job preview appeared to increase
subsequent job performance. Despite such
practical benefits, this group of studies is not
well known (only two of the four have been
published). Campbell (1971), in his recent
review of “personnel training and develop-
ment,” points out that job previews certainly
deserve further research. The present study is
an effort in this direction.

Theoretically speaking, this research was
designed to shed light on the psychological
processes intervening between job preview
[...] [real]istic vs traditional) [...] but more
importantly to gather data on the intervening
psychological processes. [...]

[...] realistic job pre[view ...] “screening [...]
individuals, namely, [...] likely to quit as a
result of a poor “match” [...]. Essentially [...]
of a realistic job preview would be [...]
individuals better match their own needs to
the need-fulfilling characteristics of a partic-
ular organizational climate. In this study,
individuals' preferences for a work environ-
ment³ were measured and were used to [...]
the effects of job previews on job acceptance.

A second way in which job preview [...]
may lead to higher job survival is by [...]
the chances for subsequent (i.e., [...])
“disillusionment” or “disenchantment.” This
seems reasonable because several [...] of
job satisfaction (see Wanous & Lawler,
1972, for a review) state that individuals
make comparisons between different [...]
of the work environment. For example,
[Port]er's (1961) theory is that “should be” is
compared to “is now” and Locke's [...]
view is that “would like” is compared to “is
now.” In addition to this, Wanous (19[...])
found that an individual's expectations about
new occupations (as opposed to organi-
zations) are only somewhat close to reality,
which suggests that inaccurate initial expecta-
tions about the world of work may be a
somewhat general phenomenon. In the present
research, initial job expectations were meas-
ured after a job preview but before job
acceptance. Comparisons were then made
between the two preview groups in an
attempt to assess how these previews affected
initial expectations.

It is, of course, quite possible that job
previews may affect both job acceptance and
initial expectations. To assess whether [...]
both of these processes occurred, the [...]
procedures are summarized. In the first place,
job preferences were measured, and a com-
parison was made between the preference[s of]
those who accepted or rejected the job [...].
If the job previews affect job acceptance [as a]
screening device, then there should [be a]
difference in job preferences between the two
preview groups. In the second place, [...]
expectations were measured after job pre-
views and were compared. If the job previews
affect initial expectations, there should [be a]
difference between the two groups. [In] both
cases, preferences and expectations, the
direction of the differences between the two
preview groups should be consistent with the
informational content of each job preview.


³ Preferences for a work environment [...]
here as the empirical representation of [what] oth[ers]
(Alderfer, 1969, 1972; Maslow, 1943, 195[...])
call “human needs.” Schneider and Bartlett (1970)
have also used work preferences as [...]
human needs.


METHOD 

This research project was conceived as an attempt
to understand an individual's “job attitudes” from
initial contact with an organization through early
work experiences (three months). The study included
both correlational data (the longitudinal measure-
ment of various job attitude and behavior variables)
and experimental data (the effects of a realistic job
preview). Only the results of the job preview exper-
iment are reported here.

This research was conducted in the “field setting”
of an eastern telephone company, using a sample of
about 80 newly hired female telephone operators who
volunteered. Participation in the study began with an
individual's initial organizational contact and termi-
nated after three months’ work experience. After a
job offer was made by the organization, but before it
was accepted by the individual, each subject was
randomly shown one of two job preview films about
the telephone operator's job. One was a traditional
recruiting film previously used by the organization.
The other was an experimental film containing much
more realistic job preview information. The exper-
imental film differed from the traditional film in two
ways: (a) It contained both “good” and “bad”
information about the job, in contrast to the mostly
“good” information in the traditional film. (b) It
was based on prior research in the same organization
(see Hackman & Lawler, 1971). Both films were 15
minutes long and actually shared in common about
4 minutes of film time.

The Job Descriptive Index (JDI; Smith, Kendall,
& Hulin, 1969) and the Minnesota Satisfaction Ques-
tionnaire (MSQ), Short Form (Weiss, Dawis, Eng-
land, & Lofquist, 1967) were modified slightly to
measure both work preferences and job expectations.
Although the item content of the JDI and MSQ was
not changed, the instructions were changed. To
measure psychological needs for work, subjects were
asked to think in terms of “preferences” for each
item. To measure initial job expectations, subjects
were asked to think in terms of “realistic expectations
when I become an operator.” As would be expected,
acceptable reliabilities were obtained. Using the
Spearman-Brown formula, the reliabilities of the
MSQ were r = .80 for realistic expectations and
r = .81 for preferences. The five scales of the JDI
averaged r = .79 for realistic expectations and r =
.74 for preferences.

Two further methodological points must be made.
First, the crucial difference between the two
films concerned the greater negative (but
realistic) content of the experimental film in com-
parison to the traditional recruiting film. If the
films “work” to create initial job expectations
consistent with their respective content, initial
expectations of those viewing the realistic film should
be lower than those seeing the traditional one.
Second, not every JDI scale or MSQ item pertained 
to expected differences between the films. Those 
scales or items whose content was either not included 
in the films or presented identically in both would 
not be expected to reflect differences between the 
two preview treatments. Only those scales or items 
whose content was related to job information 
presented differently in both films would be expected 
to register differences between the two films. Assess- 
ing which questionnaire scales or items corresponded 
to film content was done prior to data analysis by a 
group of graduate students and faculty members. 
Consensus on assessments was reached quite easily 
during group discussion. A written script for the 
entire 15-minute realistic film was used as an addi- 
tional data source after group discussion. There was 
consistency between both group consensus and the 
script. 
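
The reliabilities reported above were stepped up with the Spearman-Brown formula. As a minimal sketch, assuming the common split-half use of that formula (the article does not say how the test halves were formed, and the half-test correlation below is a made-up illustrative value, not data from the study), the correction can be written as:

```python
def spearman_brown(r_part: float, length_factor: float = 2.0) -> float:
    """Predict the reliability of a test lengthened by `length_factor`.

    With length_factor=2 this is the familiar split-half correction:
    r_full = 2 * r_half / (1 + r_half).
    """
    denominator = 1.0 + (length_factor - 1.0) * r_part
    if denominator <= 0:
        raise ValueError("r_part is too negative for this length_factor")
    return length_factor * r_part / denominator


# Hypothetical half-test correlation, chosen only for illustration.
r_half = 0.67
print(f"stepped-up reliability: {spearman_brown(r_half):.2f}")
```

The function name and the `length_factor` parameter are hypothetical; a factor of 1 leaves the coefficient unchanged, which is a quick sanity check on the formula.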


RESULTS 


Because an important purpose of this 
experiment was to analyze the dynamic effects 
of job previews, assessments were made of 
their impact on both job acceptance and 
initial expectations. The results indicate that 
the job previews had virtually no effect on job 
acceptance; that is, regardless of which film 
was seen, practically all subjects accepted the 
job offer. 

Because of randomization, there were no 
initial differences between the two groups of 
operators in their work preferences. Thus, 
when virtually all subjects (except for two) 
accepted a job offer, there were no resulting 
differences between the two groups. Had there 
been evidence of differential job acceptance 
after the preview treatment, it is possible that 
differences in work preferences might have 
appeared between the two groups. This could 
have happened if job acceptance decisions 
were in response to the content of each job 
preview. 

On the other hand, there were clear and 
significant differences in initial job expecta- 
tions between the two groups. Those viewing 
the realistic film had lower expectations in 
comparison to those seeing the traditional 
film. These differences were significant for 
those JDI scales and MSQ items which were 
related to the content of the two films. In 
addition, there were almost no significant 
differences between the two groups for those 
scales and items not related to film content. 
For example, the work (p < .03) and super-
vision (p < .005) scales of the JDI showed
significant differences as predicted, whereas
the other three scales (pay, promotions, co-
workers) did not, as was also expected.
Of the 20 MSQ items, 9 were judged relevant
and 11 were judged not relevant to differences
between the films. Of the 9 relevant items, 7
showed significant mean differences at the .05
level, one at the .10 level, and the other in the
predicted direction. Of the 11 nonrelevant
items, only 1 registered a significant difference
between the two groups. Tables 1 and 2 show
the exact results for both the JDI and the
MSQ.


TABLE 1

JOB DESCRIPTION INVENTORY:
JOB EXPECTATIONS AFTER THE JOB PREVIEW FILM

                   Realistic             Traditional
Scale            n     X      SD       n     X      SD   Relevance  Mean diff.    t       p
Work                                   33   43.06           yes        4.33      2.27   < .03
Supervision                                 47.06           yes        5.39      2.93   < .005
Promotions      45   20.07   4.59      32   19.75
Pay             45   18.18   6.22      32   20.34           no         2.16
Co-workers           45.78   8.84      33   48.91           no         3.43

Note. The Job Description Inventory was scored using 1, 2, and 4
rather than 0, 1, and 3 as is traditionally done. One-tailed t tests
were used for relevant scales because a directional prediction was
made. Two-tailed t tests were used for nonrelevant scales.


As found in previous research, there tended
to be a difference between the two groups in


|| 
terms of job survival. Of the operators who
saw the realistic film, 62% (n = 23) remained on
the job after three months of work,
as compared to 50% (n = 17) of those who
viewed the traditional film. This difference was
not statistically significant (χ² = …)
but is similar to four other field studies
in which consistent significant differences were
found.


TABLE 2

MINNESOTA SATISFACTION QUESTIONNAIRE:
JOB EXPECTATIONS AFTER THE JOB PREVIEW FILM

Minnesota Satisfaction Questionnaire item
 1. Being able to keep busy all the time
 2. The chance to work alone on the job
 3. The chance to do different things
 4. The chance to be “somebody”
 5. Supervisors handle employees well
 6. Supervisor competent in making decisions
 7. Being able to do things not against my conscience
 8. The job provides steady employment
 9. The chance to tell people what to do
10. The chance to do things for other people
11. The chance to make use of my abilities
12. Good company policies
13. …
14. The chance for advancement
15. Freedom to use my own judgment
16. The chance to use my own methods
17. Good working conditions
18. Co-workers get along with each other
19. Praise for doing a good job
20. The feeling of accomplishment from the job

Note. One-tailed t tests were used for relevant items; two-tailed
t tests were used for nonrelevant items.


When self-reports of the number of
“thoughts of leaving the organization” after
one month on the job were compared for the
two groups, those seeing the realistic film had
significantly fewer such thoughts. Thoughts
of leaving were measured on a 5-point scale
where 5 represented the extreme of no such 
thoughts. Those viewing the realistic film 
(n = 39) had a mean of 4.82 as compared to 
a mean of 4.14 for those seeing the traditional 
film (n = 29). This mean difference of .68 is 
significant at the .005 level, using a one-tailed 
t test. 
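
The job survival comparison reported earlier (62%, n = 23, versus 50%, n = 17) can be checked with a 2 × 2 chi-square test. The sketch below is an illustration rather than the author's computation: the group sizes of 37 and 34 are inferred from the reported percentages, and the uncorrected Pearson statistic is an assumption about the test that was used.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    denominator = row1 * row2 * col1 * col2
    if denominator == 0:
        raise ValueError("every row and column must have a nonzero total")
    return n * (a * d - b * c) ** 2 / denominator


# Assumed cell counts: operators who stayed and who left, per film group.
realistic_stayed, realistic_left = 23, 37 - 23
traditional_stayed, traditional_left = 17, 34 - 17

chi2 = chi_square_2x2(realistic_stayed, realistic_left,
                      traditional_stayed, traditional_left)
CRITICAL_05_DF1 = 3.841  # chi-square critical value for df = 1, alpha = .05
print(f"chi-square = {chi2:.2f}, significant at .05: {chi2 > CRITICAL_05_DF1}")
```

Under these assumed counts the statistic falls well below the critical value, which agrees with the statement that the survival difference was not statistically significant.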


DISCUSSION 

As in other research, the realistic job pre- 
view was associated with higher job survival, 
although the difference was much weaker here. 
However, the main purpose here was to 
analyze why this seems to occur. In this 
study, it appears that the effects of a preview 
are primarily on initial expectations rather 
than on an individual's acceptance of a job. 
Should we then conclude that the impact of a 
job preview on job survival is due more to 
realistic initial expectations than to job 
acceptance? Before drawing such a conclusion 
from these results, it should be pointed out 
that when an individual chooses an organi- 
zation, he is engaged in a process occurring 
over time, not in a single act. The present 
research may very likely have only involved 
the final stage of this process. In addition, it 
could be argued that because the preview 
occurred after personal effort was expended by 
an individual during testing, interviewing, 
etc., the effort expenditure enhanced an indi- 
vidual's view of the job (Lewis, 1965). Thus, 
the chances that an individual would reject a 
job offer based on the information contained 
in the job preview film may have been re- 
duced. A third possibility why differential job 
acceptance did not occur in this study is that 
labor market unemployment increased during 
the course of data collection. When this re- 
search began, unemployment rates for the 
geographical labor areas involved averaged 
about 5%. After about four months had passed, 
this rate had climbed to about 6%. During 
the final five months of data collection, un- 
employment was relatively stable, varying 
between 7.4% and 7.6% on the average. The 
absence of alternative jobs probably reduced 
an individual's freedom to reject a job offer 
on the basis of a job preview. March and 
Simon’s (1958) theory and Parnes's (1954, 
1970) research reviews support this explanation.

Although there was an indication of greater 
job survival as a result of job preview realism, 
the result was not statistically significant. 
However, there is evidence that perhaps the 
difference between the two groups is a con- 
servative estimate of what might have hap- 
pened in other circumstances. For example, if 
differential job acceptance had occurred here, 
the difference in job survival rates could have 
been larger. This could have happened 
because better “matches” of individual work 
preferences and organizational climate in the 
realistic preview group would probably have 
resulted in greater job survival for them as 
compared to the traditional preview group. 
A second possibility is that the greater num- 
ber of thoughts of leaving the organization 
might have resulted in greater losses for the 
traditional preview group if unemployment 
had not been so high. In either case, the effect 
would be to increase the difference in job 
survival rates between the two groups. 

What can be concluded from the exper- 
iment is that job previews “work” to create 
initial job expectations consistent with the 
information in the preview. Based on this and 
other studies, it appears that realistic initial 
job expectations are associated with higher 
job survival and more positive attitudes about 
staying on the job. Despite the present 
research effort, it is still not clear how job 
acceptance is influenced by job previews. 

Future research should be designed which 
will permit a more comprehensive assessment 
of the joint effects of job acceptance and 
initial expectations on job survival. Additional 
research will probably provide further infor- 
mation about the dynamics of job previews 
as a component in the process of individuals' 
match-up with new organizations and their 
subsequent job behavior. 

Some still-to-be-answered questions are the 
following. First, future research could focus 
on the techniques of job previews (e.g., films, 
written material, speeches, interviews, and 
mass media) to assess which are most effective 
in relation to costs. A second issue for further 
research concerns the timing of a job preview 
in relation to other events in the entire re- 
cruitment-selection-placement process. The 
present research had the preview at a “late” 
stage and it had no effect on job acceptance. 
Research should be aimed at the effects of 
an “earlier” job preview on the initial decision 
to seek out an organization as well as on the 
decision to accept a job offer. A third and 
crucial need is for an integrated conceptual 
view of the entire “joining-up” process, one 
which approaches this from both sides—from 
that of the individual and that of the orga- 
nization. One direction this effort might take 
is to integrate the relevant components of 
traditional industrial psychology, vocational 
psychology, and the climate research of orga- 
nizational psychology.


ATTITUDE TOWARD INVASION OF PRIVACY IN THE 
PERSONNEL SELECTION PROCESS AND JOB 
APPLICANT DEMOGRAPHIC AND 
PERSONALITY CORRELATES * 


BERNARD L. ROSENBAUM ? 


Personnel Psychology Center, New York 


Attitude toward invasion of privacy in the selection process by subgroups of
the job applicant population was explored. Attitude toward privacy invasion
and personality correlates was also examined. Subjects consisted of 1,392 job
applicants. Attitudes were measured by the invasion of privacy questionnaire,
a 66-item instrument factor analyzed into five component factors. Although
many of the correlations between demographic and invasion of privacy factors
were significant, they were not large enough to account for much meaningful
variance. Personality correlates suggested a set of variables which might merit
further investigation.


The controversy surrounding personnel
evaluation and invasion of privacy in the
selection process still rages despite many ef-
forts on the part of legislators, educators and
psychologists to establish a set of guidelines.
As Guion (1967) notes, the empirical litera-
ture has been silent about the kinds of probing
that people consider unwarranted intrusions,
and it is apparent that the lack of opera-
tional definitions of invasion of privacy is,
at least in part, contributing to the problems
encountered in attempting to set standards.
Moreover, Barrett (1968) makes the point
that “the profession and the public need
guidelines to help determine whether a given
piece of information, collected in a specified
way, and used for various purposes by some
known level of training and competence is
or is not an invasion of privacy [p. 261].”

Indeed, with the possibility that future
legislation may restrict selection and evalua-
tion procedures, research data which might
help bring objectivity to the issue of privacy
invasion are sorely needed. Therefore, this
study addresses itself to this problem by
investigating the following: (a) attitude
toward invasion of privacy in the selection
process among subgroups of the job applicant
population and (b) attitude toward invasion
of privacy in the selection process and
personality correlates.


METHOD 


Sample 


A total of 1,392 job applicants was surveyed in 
company personnel departments and on the premises 
of a New York City management consulting firm 
engaged in the evaluation of personnel for industry. 
A representative sample of companies was chosen 
which were diverse in type of work or product 
involved and in geographical location. The sample 
included companies engaged in manufacturing, pub- 
lishing, retailing, transportation, and banking. 

Of the 1,392 job applicants, 352 were administered 
personality tests in addition to the invasion of 
privacy questionnaire. The median age of this group 
was 31.8 years. The majority were married males 
with a college or graduate school education and with 
incomes of over $10,000 a year. Most were applying 
for sales and a variety of middle management 
positions. 


Invasion of Privacy Questionnaire 


The invasion of privacy questionnaire is an instru- 
ment that was developed specifically for this study 
in order to measure the degree to which subjects 
(job applicants) think their privacy is being invaded. 
Five dimensions of possible inquiry during the course 
of the employee selection process were studied. 

Before answering the questionnaire, the subjects 
were asked to read the instructions, which assured 
them of anonymity and informed them of the in- 
vasion of privacy issue as it relates to personnel 
selection. Subjects were asked if, assuming that some
responsible person believes a particular topic to be
a useful source of personnel information, questions
addressed to job applicants about that topic would
constitute an invasion of their privacy. The following
is a typical item:

Birthplace of parents
(a) Invasion of my privacy (should not be
asked)
(b) Undecided
(c) Not an invasion of my privacy (can be
asked)


TABLE 1

MEAN SCORES OF THE FIVE ITEMS HAVING HIGHEST
LOADING IN EACH OF THE FIVE INVASION
OF PRIVACY CLUSTERS

Item                                          Loading     M
Factor 1
  Church membership                             .70      1.89
  Frequency of church attendance                .70      1.68
  Religious beliefs                             .68      1.97
  Racial or ethnic background                   .66      1.08
  Description of brothers and sisters           .63      1.95
Factor 2
  School degrees received                       .67      2.96
  Names of companies previously worked for               2.93
  Duties of past jobs                                    2.92
  Dates of previous employment                           2.89
  Home telephone number                                  2.94
Factor 3
  Business, community and social clubs                   2.63
  Hobbies                                                2.53
  Magazines regularly read                      .56      2.52
  Extracurricular activities while in school    .55      2.55
  Interests (personal likes and dislikes)
Factor 4
  Loans being paid
  Rent or mortgage paid                                  1.84
  How income is budgeted                                 1.82
  Savings                                       .66      1.60
  Other sources of income                       .64      1.55
Factor 5
  Relationships with supervisors                         2.62
  Use of habit-forming drugs                             2.61
  Relationship with co-workers                           2.73
  Criminal record                                        2.72
  Explanation of periods of unemployment        .39      2.70


Weights from 1 to 3 were assigned to the
choices for scoring purposes, 1 being in the invasion
of privacy direction and 3 in the noninvasion of
privacy direction. Thus, the lower a subject's score
is on any of the invasion of privacy scales, the
greater is his feeling of privacy invasion. 
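
The scoring rule just described can be sketched as follows. This is an illustrative sketch only: the function name and the sample answers are hypothetical, and averaging the item weights (rather than summing them) is an assumption, chosen because Table 1 reports item means on the 1-to-3 range.

```python
from statistics import mean

# Weights from the text: 1 = invasion of privacy, 2 = undecided,
# 3 = not an invasion of privacy.
CHOICE_WEIGHTS = {"a": 1, "b": 2, "c": 3}


def scale_score(responses: dict) -> float:
    """Mean weight over one scale's items; lower means a stronger feeling of invasion."""
    if not responses:
        raise ValueError("at least one item response is required")
    return mean(CHOICE_WEIGHTS[choice] for choice in responses.values())


# Hypothetical answers of one applicant to three Factor 1 items.
answers = {
    "Church membership": "a",
    "Frequency of church attendance": "a",
    "Religious beliefs": "b",
}
print(f"scale score: {scale_score(answers):.2f}")
```

An applicant who marks every item as (c) would score 3, the noninvasion end of the scale.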

Development of items. A review was made of
corporation application blanks, 5 published applica-
tion blanks (Bruce, 1959; Personnel Development
Associates, 1962; Personnel Psychology Center, 1964;
Sales Executive Club, 1955; Wonderlic, 1967), and
5 interviewer guide books (Balinsky, 1959; Cash &
Crissy, 1963; Fear, 1958; Lopez, 1965; Witkin, 1962)
in order to generate a representative list of topics
inquired about during the course of the employment
selection process. A trial version of the invasion of
privacy questionnaire was administered to a group
of 20 employed subjects. The group consisted of 10
males and 10 females of differing educational back-
grounds and job levels. Subjects completed the ques-
tionnaire and were asked to critique it for clarity of
instruction and items. This resulted in a “debugging”
of the instrument and items. The final instrument
consisted of 66 items and was administered to the
job applicant sample.

Factor analysis. Product-moment correlations were
computed among the 66 items yielding a correlation
matrix. The matrix of intercorrelations was submitted
to a principal-components factor analysis. Examina-
tion of the latent roots of the full matrix revealed
five roots in excess of one. A varimax rotation, as
described by Harman (1967), yielded five inter-
pretable orthogonal factors.

Items were assigned to one of the five factors on the
basis, first, that a particular item loaded higher
on that factor than on any other and, second, that it
loaded .30 or above on that factor. Items were elimi-
nated either for loading below .30 or for loading
on more than one factor. It is noted that the
means of items within clusters tended to be relatively
homogeneous. The mean scores of the items in
Factors 2, 3, and 5 were in the noninvasion of
privacy direction and those in Factors
1 and 4 were in the direction of privacy invasion.
It appears as though the factors are defined in part
by sensitivity level and in part by content, the
two having a mutually reinforcing effect.

Definition of the five factors. Factor 1, family
background and influences, accounted for 30.4% of
the common factor variance and had an internal con-
sistency reliability coefficient of .93. Items dealing
with religion and race had the largest loadings on
this factor. Other topics included in this factor were
relationships with family members, sexual attitudes,
political beliefs, and family history.


Factor 2, personal history data, accounted for
23.2% of the common factor variance and had an
internal consistency reliability coefficient of .87. Fac-
tor 2 concerned such topics as age, marital status,
number of children, educational history, and work
history.

Factor 3, interests and values, accounted for 19.3%
of the common factor variance and had an internal
consistency reliability coefficient of .86. Factor 3
probed such areas as avocational interests, aspira-
tions, job and school preferences, and economic inde-
pendence.

Factor 4, financial management data, accounted
for 18.1% of the common factor variance and had
an internal consistency reliability coefficient of .89.
The items defining this factor clearly related to the
management of financial matters such as loans, rent,
savings, and insurance.

Factor 5, social adjustment, accounted for 8.8% of
the common factor variance and had an internal
consistency reliability coefficient of .78. This factor
was primarily concerned with defiant behavior in-
volving drugs, crime, and alcohol, but also included
psychiatric history, periods of unemployment, and
relationships with people on the job.


Treatment of the Data 


The statistical procedure utilized in investigating 
relationships between invasion of privacy cluster 
scores and subgroups of the job applicant population 
was correlation analysis. Product-moment correlation 
coefficients were also employed in investigating rela- 
tionships between personality variables and invasion 
of privacy cluster scores. 


RESULTS 


Attitude toward Invasion of Privacy in the 
Selection Process among Subgroups of the 
Job Applicant Population 


Table 1 shows the mean scores for each of 
the items in the five invasion of privacy clus- 
ters. It should be recalled that mean scores 
can range from 1 to 3, 1 being in the invasion
of privacy direction and 3 in the noninvasion
of privacy direction. Inspection of the table
indicates that, in general, the job applicants
saw inquiry into the management of one's
finances (Factor 4) and family background
questions (Factor 1) as being the most sensitive
insofar as privacy invasion is concerned.
Factors 2, 3, and 5 (personal history data,
interests and values, and social adjustment,
respectively) cover topics about which applicants
were less concerned when it came to
privacy invasion.


TABLE 2

INTERCORRELATIONS OF JOB APPLICANT DEMOGRAPHIC
GROUPS AND FIVE INVASION OF PRIVACY
CLUSTERS

Cluster    Age    …    Rural/urban background    Income level

Table 2 shows the intercorrelations of the 
job applicant demographic groups and the five 
invasion of privacy clusters. Although many 
of the correlations between demographic and 
invasion of privacy factors are significant, 
they are not large enough to account for much 
meaningful variance, indicating that resem- 
blances across groups were high. 
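
The contrast between statistical significance and explained variance can be illustrated with a short calculation. This is a sketch under stated assumptions: it treats a correlation of .06 as if it were based on all 1,392 applicants (the article does not give the n behind each coefficient) and uses the usual t transformation of r with a normal-approximation critical value of 1.96.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t statistic for testing a product-moment correlation against zero (df = n - 2)."""
    if n <= 2 or abs(r) >= 1:
        raise ValueError("need n > 2 and |r| < 1")
    return r * math.sqrt((n - 2) / (1 - r * r))


r, n = 0.06, 1392  # n is assumed, see the note above
t = t_from_r(r, n)
variance_explained = r ** 2  # proportion of variance shared by the two variables

print(f"t = {t:.2f}, exceeds 1.96: {abs(t) > 1.96}")
print(f"variance explained = {variance_explained:.2%}")
```

With a sample this large, even a coefficient that shares well under 1% of the variance clears the conventional .05 threshold, which is the pattern the paragraph above describes.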

However, the correlations do suggest 
trends. Older applicants were less concerned 
about privacy invasion when questioned about 
topics concerning interests and values and 
social adjustment. Females were more con- 
cerned than males about topics involving per- 
sonal history data, interests and values, and 
social adjustment and were less concerned 
than males about topics involving the man- 
agement of one's finances. 

As education increased, so did one's will- 
ingness to respond to questions concerning 
personal history data and interests and values. 
Job applicants with lower levels of education 
thought that questions concerning finances 
were more acceptable than did applicants 
with higher levels of education. 

Applicants from urban backgrounds thought 
that inquiry into family background was 
more of an invasion of privacy than did 
applicants from rural backgrounds. On the 
other hand, urban background applicants 
thought that the questions that address them-
selves to one's interests and values and fi-
nances were less of an invasion than did the
rural background applicants.


TABLE 3

CORRELATION OF JOB APPLICANT GUILFORD-ZIMMERMAN
TEMPERAMENT SURVEY SCORES WITH
INVASION OF PRIVACY CLUSTER SCORES

Trait
General activity
Restraint
Ascendance
Sociability
Emotional stability
Objectivity
Friendliness
Thoughtfulness
Personal relations
Masculinity


The higher the applicant’s income, the more
favorable his attitude was to items concerning
personal history data, interests and values,
and social adjustment. However, a significant,
but slight trend was noted in the opposite
direction for items dealing with finances.


Attitude toward Invasion of Privacy in the
Selection Process and Personality Correlates


Of the 1,392 job applicants, 352 were ad-
ministered the Guilford-Zimmerman Tempera-
ment Survey (1949) and the Edwards Per-
sonal Preference Schedule (1954). Table 3
reports the correlation of job applicant Guil-
ford-Zimmerman scores with invasion of pri-
vacy cluster scores. It can be noted that sig-
nificant correlations … Emotional Stability,
Objectivity, and Thoughtfulness with two
factors, … one factor. … Restraint, correlated
significantly with Factor 3, interests and values.

Table 4 reports the correlation of job appli-
cant Edwards Personal Preference Schedule
scores with invasion of privacy cluster scores.
Here it can be seen that Dominance corre-
lated significantly with four out of five
invasion of privacy factors, Achievement and
Intraception with two factors, and Deference
and Abasement with one factor.

All of the correlations recorded are too low
to be of any practical value in offering con-
clusions of a firm nature.
However, the fact that a pattern of consis-
tency emerged for some scales in correlating
significantly with all or nearly all of the five
invasion of privacy factors suggests that fu-
ture research dealing with other variables or
sets of variables may be worthwhile.


DISCUSSION


Despite the fact that the correlations be-
tween demographic and invasion of privacy
factors were not large enough to account for
much meaningful variance, there are indica-
tions of job applicant subgroup differences in
attitude toward the privacy issue in per-
sonnel selection. This study suggests that the
issue of invasion of privacy is a complex one
that may not lend itself to broad sweep-
ing legal restrictions. Some support was found
for the Office of Science and Technology
statement (1967) that

any general injunction against study of specific areas
of behavior wholly misses the essence of privacy …
fails to protect some people from being revealed in
ways that are most upsetting to them while …
others who are quite willing to reveal …
[p. 12].



TABLE 4


CORRELATION OF JOB APPLICANT EDWARDS PERSONAL PREFERENCE SCHEDULE SCORES WITH
INVASION OF PRIVACY CLUSTER SCORES


                                      Factor
Personality variable      1        2        3        4        5
Achievement              .10*    -.03      .00     -.10*    -.04
Deference                .08     -.03      .07      .07      …*
Order                   -.04      .02      .04     -.03      .01
Exhibition               .00      .02      .09      .04      .00
Autonomy                -.02      .04     -.02      .00     -.08
Affiliation             -.01     -.04     -.01     -.03      .01
Intraception             .06      .09*     .00      .02      .13*
Succorance              -.02     -.05     -.05     -.08     -.05
Dominance                .11*     .01      .13**    .16**    .10
Abasement                 …      -.03     -.03      .06      .06
Nurturance              -.01      .02     -.04      .00      .01
Change                    …      -.02     -.01     -.05     -.08
Endurance               -.07     -.04      .00       …       .00
Heterosexuality         -.03      .03     -.04     -.04     -.08
Aggression               .00      .00     -.02      .07      .00


* p < .05.
** p < .01.


In addition, the findings related to the 
Personal Relations scale of the Guilford-Zim- 
merman may mean that cooperativeness on 
the one hand and criticalness and intolerance 
on the other, as manifest in the temperament 
makeup of an individual, have a bearing on 
the extent to which one is apt to view the 
whole issue of personal questioning in the
employee selection process as an invasion of
privacy. Clearly, though, other measures of
this construct, both objective and subjective
in nature, will be needed in order to test this
notion.

Furthermore, the combination of findings
that reported significant correlations for both
the Sociability and Personal Relations scales
of the Guilford-Zimmerman (1949) and the
Dominance scale of the Edwards (1954) sug-
gests the possible existence of a set of “per-
sonal impact” variables which may be able to
predict attitude toward privacy invasion. In
this context, personal impact variables are meant
to refer to those temperament traits
which relate to the type of image an individual
projects of himself in his interpersonal
relationships. For instance, in a selection interview,
the individual who knowingly
conveys an air of outgoingness, cooperation,


and forcefulness in presenting himself and in 
answering the questions posed to him may 
feel less threatened by the possibility of hav- 
ing his responses misinterpreted and may, 
therefore, be less likely to view the inter- 
viewer’s probes as an invasion of privacy. On 
the other hand, a person who is by nature 
more shy and retiring, critical of others, and 
not very assertive in getting his point across, 
may feel threatened in this situation by his 
lack of skill in communicating information 
about himself. Under these circumstances, the 
latter individual may very well consider that 
his privacy is being invaded. In any event, 
further investigation will be needed to sub- 
stantiate either of these viewpoints. 


REFERENCES 


Balinsky, B. The executive interview: A bridge to
people. New York: Harper, 1959.

Barrett, R. S. Review of A. F. Westin, Privacy and
freedom. Personnel Psychology, 1968, 21, 261-264.

Bruce, M. Personal history audit. New Rochelle,
N.Y., 1959.

Cash, H., & Crissy, W. Tools of personnel selection.
New York: Personnel Development Associates,
1963.

Edwards, A. L. Edwards Personal Preference Sched-
ule. New York: Psychological Corporation, 1954.

Fear, R. A. The evaluation interview. New York:
McGraw-Hill, 1958.

Guilford, J. P., & Zimmerman, W. S. The Guil-
ford-Zimmerman Temperament Survey. Beverly
Hills, Calif.: Sheridan Supply Company, 1949.

Guion, R. M. Personnel selection. Annual Review
of Psychology, 1967, 18, 208-209.

Harman, H. H. Modern factor analysis. (2nd ed.)
Chicago: University of Chicago Press, 1967.

Lopez, F. Personnel interviewing: Theory and prac-
tice. New York: McGraw-Hill, 1965.

Office of Science and Technology. Privacy and
behavioral research. Washington, D.C.: U.S. Gov-
ernment Printing Office, 1967.

Personnel Development Associates. Personnel His-
tory Form. Flushing, N.Y.: Author, 1962.

Personnel Psychology Center. Qualification Sum-
mary. New York: Author, 1964.

Sales Executive Club. Salesman application. New
York: Author, 1955.

Witkin, A. A. A business executive’s guide to inter-
viewing. New York: Personnel Psychology Center,
1962.

Wonderlic, E. F. Personnel application: Standard
employment form. Northfield, Ill.: E. F. Wonder-
lic, 1967.


(Received July 24, 1972) 


Journal of Applied Psychology
1973, Vol. 58, No. 3, 350-346


EFFECTS OF CURTAILMENT ON AN ADMISSIONS MODEL

Two admissions models are developed to predict future academic performance
of graduate management students. The first model is based on 40 students
who were admitted and enrolled in the program and is uncorrected for
curtailment. The second model is developed from the total applicant popula-
tion of 222 students after curtailment correction. The corrected model demon-
strates higher predictive validity than the uncorrected model for two future
classes of students. Furthermore, different predictors enter each model, affecting
the beta weights, validities of predictors, and the total amount of variance
explained by the models. A factor analysis and an analysis of admissions
decisions offer additional support for the curtailment-corrected model in
selecting students with high academic potential.


The demand for graduate management education has increased tremendously in
recent years. A management school faced with limited resources and more
applicants than it can accept must decide on a strategy to select the most
promising talent from information acquired prior to entry.

There has been a growing interest among psychologists to develop more precise
models for the mechanical combination of variables (see, e.g., Einhorn, 1972;
Sawyer, 1966). Sawyer reviewed research using both clinical judgment and
actuarial models and concluded that the actuarial models are superior in
predictive power. Recently, Dawes (1971) provided evidence that an actuarial
admissions model outpredicted the decision maker's clinical assessment of
potential graduate students. Yet, in spite of this evidence, most business
school admissions committees persist in using overall clinical judgments to
make their admissions decisions (Page & West, 1969).

Much of the research on predicting graduate performance has been reviewed and
summarized by Harrell (1961), Educational Testing Service (1966), and
Mehrabian (1969). However, the studies reported in these reviews suffer from a
serious methodological deficiency because they were based only on samples of
students who had already been admitted to graduate programs. Since these
students were a "select group," any model developed from data obtained from
this group may not be applicable to the entire pool of those who seek entry.
The extent to which these models are distorted depends upon the degree to
which the admitted group differs from the applicant group on variables that
are considered in developing the model. This has been discussed in the
psychological measurement literature as the curtailment or restriction of
range problem (Lord & Novick, 1968; Thorndike, 1949). Fortunately, it is
possible to statistically correct for curtailment provided the standard
deviations and intercorrelations of the predictors are known for the applicant
population in addition to the admitted sample.

1 Requests for reprints should be sent to V. Srinivasan, Graduate School of
Management, University of Rochester, New York 14627.

2 The authors did not draw this conclusion although their analysis clearly
demonstrated the superiority of the actuarial model. See Weinstein (1972) for
discussion on this point.

The curtailment correction assumes a linear 
extension of the relationship between the 
predictors explicitly used to select students 
and criterion data of those admitted into the 
program. The correction formulas allow the 
decision maker to estimate what the criterion 
score would have been if the person had en- 
rolled. In reviewing the curtailment literature, 
Lord and Novick (1968) reported studies 
which concluded that corrected correlations are 
more accurate than uncorrected correlations. 
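
As a rough illustration of what a curtailment correction does, the sketch below implements the familiar single-predictor correction for direct selection (often called Thorndike's Case 2). This is only the one-variable special case, not the multivariate procedure used in this study, and the function name and example numbers are hypothetical.

```python
import math


def correct_for_curtailment(r_selected, sd_applicant, sd_selected):
    """Single-predictor range-restriction correction (Thorndike Case 2).

    r_selected   : predictor-criterion correlation in the selected sample
    sd_applicant : predictor standard deviation among all applicants
    sd_selected  : predictor standard deviation among those selected
    Assumes linear regression and equal error variance across the range.
    """
    u = sd_applicant / sd_selected  # ratio of unrestricted to restricted SD
    return (r_selected * u) / math.sqrt(
        1.0 - r_selected ** 2 + (r_selected ** 2) * (u ** 2)
    )


# Hypothetical numbers: selection has shrunk the predictor SD from 1.0 to 0.6.
example = correct_for_curtailment(0.30, sd_applicant=1.0, sd_selected=0.6)
print(round(example, 3))
```

When the two standard deviations are equal the function returns the observed correlation unchanged; the more the selected group's spread has been reduced, the larger the upward adjustment.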

The purpose of the research reported here 
is to develop and validate a graduate student 
selection model corrected for curtailment bias. 
The model is developed from applicant information and records of both accepted
and rejected students at the Graduate School of Industrial Administration at
Carnegie-Mellon University.

TABLE 1

PREDICTOR VARIABLES USED IN DEVELOPING ADMISSIONS MODEL

Selection variables (predictors used by admissions committee)
 1. Admission Test for Graduate Study in Business score (Quantitative)
 2. Admission Test for Graduate Study in Business score (Total)
 3. Candidate Excellence by School index(a) (Total)
 4. Candidate Excellence by School index (Verbal)
 5. Candidate Excellence by School index (Quantitative)
 6. Undergraduate grade point average [qualifier illegible]
 7. Undergraduate grade point average (science)
 8. Undergraduate grade point average (engineering)
 9. Number of months of work experience
10. Age
11. Marital status (married, single)(b)
12. Rating of offices held
13. Rating of scholastic recognition
14. Rating of involvement in social organizations
15. Rating of involvement in sports activity
16. Number of sports activities participated in
17. Weighted sports activities (product of 15 and 16)

Nonselection variables (other predictors)
 1. Admission Test for Graduate Study in Business score (Verbal)
 2. Undergraduate grade point average (freshman)
 3. Undergraduate grade point average (sophomore)
 4. Undergraduate grade point average (mathematics and statistics)
 5. Undergraduate grade point average (computer [...])
 6. Undergraduate grade point average (management)
 7. Undergraduate grade point average (economics)
 8. Undergraduate grade point average (humanities and social sciences excluding economics)
 9. Number of years since last attended school
10. Weighted work experience (by length and relevance to management)
11. Excess number of years as undergraduate
12. Rating of letters of recommendation
Further entries (item numbers illegible): Weighted graduate work (by length and
relevance); Undergraduate college major (engineering),(b) (mathematics and
statistics),(b) (science), (management), (humanities and social sciences
excluding economics)

Note. Undergraduate grade point average (senior) was not available in most
cases since admissions decisions were made prior to completion of the senior
year.
(a) This index is taken from tables compiled by the Educational Testing Service
on the average Admission Test for Graduate Study in Business score of all
students taking the test from the candidate's undergraduate school.
(b) These predictors were treated as dummy variables.

METHOD

Sample

The Graduate School of Industrial Administration is a two-year program which
emphasizes analytical (in particular, quantitative) approaches to management.
Over 80% of the students have engineering, science, or mathematical
backgrounds. On the average, an entering student has had one year of work
experience. The graduating classes of 1968 and 1969 were chosen as the target
population for this study. The total number of applications received during
this period was 384. This applicant population was composed of three different
groups: (a) American students who received a baccalaureate degree (n = 258),
(b) Carnegie-Mellon students who applied directly to the graduate program upon
completion of their junior year in college (n = 12), and (c) foreign students
(n = 114). The present study deals only with the first group. This sample was
further reduced to an applicant sample of 222 with the elimination of 8
students who never completed the program and 28 students who had missing
[...]. Of those who did not complete the program, only 3 had been asked to
leave because of poor performance. The remainder left voluntarily, were
drafted for military service, or were accepted directly into the doctoral
program before completing the master's degree. The 222 applicants were further
classified into those admitted (n = 120) and those rejected (n = 102). Of the
120 applicants admitted, [...] decided to enroll. However, as mentioned
earlier, [...] students did not finish and another 3 had missing predictor
data so that only 40 students had both predictor and criterion data. A
statistical analysis comparing the selected sample with the remaining admitted
students revealed no significant differences in the predictors. Consequently,
the 40 students can be considered an unbiased sample of the admitted students
and will be denoted as the selected sample.

Criterion

Previous studies have chosen criteria ranging from first-year grades in
graduate school to [...] competence. In the present study, graduate grade
point average was used as the criterion. The reliability and importance of
grade point average as a criterion was supported by the fact that, for
Graduate School of Industrial Administration graduates, grade point average
was significantly related to later progress in management (Weinstein &
Srinivasan, in press).

Predictors

Table 1 lists the predictor variables used in the study. In order to qualify
as a predictor, information must have been known prior to the time the student
was admitted or rejected by the school. Most of the predictors were taken
directly from transcripts and application forms. However, relevant information
about several variables existed in either narrative or descriptive form and
measures of these descriptions needed to be developed. These are discussed
below.

Extracurricular activities. Four independent scales were developed:

1. A rating scale of scholastic recognition. This scale included awarded
scholarships, membership in honorary societies, Dean's List, etc.
2. A 9-point scale of offices held in organizations. A high rating was given
to those who held leadership positions in a broad range of organizations.
3. A rating scale based on the extent of involvement and breadth of
social-organizational activities.
4. A rating scale of involvement in sports activities ranging from casual
playing of athletic games to college level participation.

Letters of recommendation. This rating was based on the strength and
consistency of recommendations as well as explicit mention of certain
qualities such as scholarship, leadership potential, motivation, and
interpersonal skills. Two graduate students trained to identify these
qualities rated a sample of 30 cases independently. The interrater reliability
for these cases was [value illegible].

Work experience. For both work and previous graduate experience, weighted
scales were developed as products of (a) the length of experience (months) and
(b) relevance to management.

Grades and course content areas. All undergraduate grade point averages were
computed on a 4-point scale where A = 4, B = 3, C = 2, D = 1, and Failure = 0
for each course. In assigning courses to content areas several decision rules
were adopted. For example, a course titled mathematical psychology was
classified under mathematics and statistics rather than humanities and social
sciences. The reason for such a classification was to emphasize the skills
utilized and the nature of the content covered in a course, but not the
orientation of the course. Course descriptions were used to guide judgments
whenever possible.

Procedure for Curtailment Correction

In developing a model to predict the academic performance of applicants,
several nonlinear models with higher order terms and interactions were tested;
however, there was no significant difference between any of the nonlinear
models and a linear model, as measured by the F test. Since there were about
30 potential predictors of academic performance, a stepwise multiple
regression model was used and only those predictors that added significant
incremental validity were retained. To do this, the variance-covariance matrix
of the predictors and the criterion for the applicant population was
necessary. Denoting by P the set of predictors and by C the criterion,³ the
variance-covariance matrix can be written as

                       P            C
Predictors (P)      $V_{PP}$     $V_{PC}$
Criterion (C)       $V_{CP}$     $V_{CC}$


The variance-covariance matrix $V_{PP}$ of the predictors
can be easily estimated since the predictors are
known for the applicant sample. However, the criterion
values are not known for the applicants who
did not enroll in the program. Consequently the
entries $V_{PC}$ (or $V_{CP}$) and $V_{CC}$ cannot be directly
computed. Curtailment correction offers a method
for indirectly estimating these entries.
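
The reason a complete variance-covariance matrix is enough for the regression work can be sketched as follows: regression weights and the squared multiple correlation depend only on the partitioned matrix, not on individual records. The code is a minimal sketch with hypothetical function names; in particular, `forward_select` runs a fixed number of steps by largest gain in R squared, which is a simplification and not the significance-based stopping rule described in the paper.

```python
import numpy as np


def regression_from_cov(V, predictors, criterion):
    """Raw regression weights and R^2 computed from a covariance matrix.

    V          : full variance-covariance matrix (predictors and criterion)
    predictors : list of row/column indices used as predictors
    criterion  : index of the criterion variable
    """
    V = np.asarray(V, dtype=float)
    V_pp = V[np.ix_(predictors, predictors)]   # predictor block
    V_pc = V[predictors, criterion]            # predictor-criterion covariances
    weights = np.linalg.solve(V_pp, V_pc)      # solves V_pp w = V_pc
    r_squared = float(V_pc @ weights / V[criterion, criterion])
    return weights, r_squared


def forward_select(V, candidates, criterion, n_steps):
    """Greedy forward selection: add the variable giving the largest R^2."""
    chosen = []
    for _ in range(min(n_steps, len(candidates))):
        remaining = [j for j in candidates if j not in chosen]
        best = max(
            remaining,
            key=lambda j: regression_from_cov(V, chosen + [j], criterion)[1],
        )
        chosen.append(best)
    return chosen


# Hypothetical 3-variable example: two uncorrelated predictors, one criterion.
V_demo = np.array([[1.0, 0.0, 0.6],
                   [0.0, 1.0, 0.3],
                   [0.6, 0.3, 1.0]])
w_demo, r2_demo = regression_from_cov(V_demo, [0, 1], 2)
```

Applied to a corrected matrix and to an uncorrected one, the same routine can select different variables, which is the effect the paper goes on to report.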

The admissions committee does not, in general,
consider all the predictors listed in Table 1 in selecting
students. A list of all the predictors was presented
to the admissions officers and they were asked
to check those variables which they mainly used in
selecting students for the 1968-1969 classes. The
subset of the predictors used by the admissions
committee is referred to as selection variables (S)
with remaining predictors called nonselection variables
(N).⁵ Now the variance-covariance matrix of
the applicant population can be rewritten as

                                 S            N            C
Selection variables (S)       $V_{SS}$     $V_{SN}$     $V_{SC}$
Nonselection variables (N)    $V_{NS}$     $V_{NN}$     $V_{NC}$
Criterion (C)                 $V_{CS}$     $V_{CN}$     $V_{CC}$


The procedure for curtailment correction assumes 
that the multiple regression equation relating the 
criterion to the selection variables is identical for 
both the selected and the applicant populations. It is 
also assumed that the error in predicting the cri- 
terion is the same for all values of the selection vari- 


3 Here P denotes a vector of (say) 30 predictors
$P_1, P_2, \ldots, P_{30}$, whereas C denotes the criterion
(graduate grade point average).

4 The admissions committee was unsure in designating
four of the predictors as selection variables.
The curtailment corrections reported later in this
paper consider these four variables as selection variables.
However, a sensitivity analysis was performed
by classifying these four as nonselection variables and
the data were reanalyzed. The stepwise regression
results were not significantly different for the two
analyses.

5 Rydberg (1963) called these two sets of variables
"directly biased" and "indirectly biased" variables
whereas Thorndike (1949) and Lord and Novick
(1968) denoted them as "explicit" and "incidental"
selection variables.
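
The two assumptions stated above (the same regression on the selection variables in both groups, and the same error of prediction throughout) lead to the standard multivariate range-restriction formulas usually credited to Pearson and Lawley and presented by Lord and Novick (1968). The sketch below shows those formulas under the assumption that all incidental variables (nonselection variables and criterion) are treated together; the function name, argument layout, and demonstration matrix are hypothetical, and this is not claimed to reproduce the authors' exact computations.

```python
import numpy as np


def lawley_correction(V_ss, W, n_sel):
    """Estimate the applicant covariance matrix from a selected sample.

    V_ss  : applicant covariance of the explicit selection variables
    W     : covariance matrix in the selected sample, ordered so that the
            n_sel selection variables come first and all other variables
            (nonselection variables, criterion) follow
    n_sel : number of explicit selection variables
    """
    V_ss = np.asarray(V_ss, dtype=float)
    W = np.asarray(W, dtype=float)
    W_ss = W[:n_sel, :n_sel]
    W_sy = W[:n_sel, n_sel:]
    W_yy = W[n_sel:, n_sel:]

    # Regression of the incidental variables on the selection variables,
    # assumed identical in the selected and applicant populations.
    B = np.linalg.solve(W_ss, W_sy)

    V_sy = V_ss @ B
    # Residual covariance (assumed unchanged) plus the part explained by S
    # when S has its applicant-population covariance.
    V_yy = W_yy - W_sy.T @ B + B.T @ V_ss @ B
    return np.block([[V_ss, V_sy], [V_sy.T, V_yy]])


# Hypothetical check: restrict a known population matrix, then correct it back.
V_pop = np.array([[4.0, 1.2, 1.0],
                  [1.2, 2.0, 0.5],
                  [1.0, 0.5, 1.5]])
W_sel = lawley_correction(np.array([[1.0]]), V_pop, 1)    # variance cut 4 -> 1
V_back = lawley_correction(V_pop[:1, :1], W_sel, 1)       # recover population
```

Because the same algebra maps the population to the selected group and back, the round trip recovers the starting matrix exactly when the assumptions hold; with real data the recovery is only as good as the linearity and equal-error assumptions.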
V. SRINIVASAN AND ALAN G. WEINSTEIN

TABLE 2

STEPWISE REGRESSION MODELS BASED ON SELECTED SAMPLE AND APPLICANT POPULATION

Columns: order of appearance, β, and r for the selected sample (n = 40,
uncorrected) and for the applicant population (n = 222, corrected for
curtailment). [Numeric entries not recoverable from the source.]

Variables: Undergraduate college major (engineering); Admission Test for
Graduate Study in Business (Verbal); Undergraduate grade point average
(junior); Candidate Excellence by School index (Verbal); Rating of offices
held; Weighted graduate work; Weighted sports activities; Undergraduate grade
point average (economics).

Note. Order of appearance refers to the order in which a variable entered the
stepwise regression. The r refers to the correlation coefficient with the
criterion (graduate grade point average).
* p < .10.
** p < .05.
*** p < .01.

As pointed out earlier, previous research in this area had always used the
data of the selected population (i.e., the matrix W) and assumed that the same
results hold for the applicant population as well. The matrix W will be
referred to as the [...] variance-covariance matrix.

[...] for example, in one analysis some of the partial [...] than one. Such a
[...] elements [...] and $V_{SS}$ are [...] the applicant sample, [...] the
correction formulas. To remedy this situation $V_{SN}$ and $V_{NN}$ were also
obtained using the correction formulas (Rydberg, 1963), [...] available for
nonselection variables [...] that V satisfies the necessary [...]
variance-covariance matrix. In addition, this provided an opportunity to
examine [...] the correction procedures. The submatrices $V_{SN}$ and $V_{NN}$
obtained using the correction formulas were compared with the corresponding
[...] from the data of the applicant sample. The corresponding correlation
coefficients were [...] different (p < .05) in only 8% of the [...],
supporting the accuracy [...].

RESULTS AND DISCUSSION

A comparison of the results of the stepwise regression for the two models is
given in Table 2. Stepwise regression was terminated at a stage when by adding
an additional predictor the incremental validity was not significant. For
comparison purposes, the same number of steps was carried out for the
curtailed sample. The two lists of predictors are different, indicating the
effect of the restriction of range problem on the variables that appear in the
model. The second most important predictor (undergraduate grade point average,
economics) in the corrected model did not enter the curtailed model. In
addition, the beta weights and validities of the predictors are quite
different in the two models. Finally, the amount of variance accounted for in
graduate grade point average differs in the two models: curtailed model,
R² = 48% and corrected model, R² = 76%.

Schrader, Smith, and Winterbottom [...] reported regression models using
Admission Test for Graduate Study in Business scores and undergraduate grades
and concluded:

although [...] coefficients, there is excellent reason to believe that its
effects on regression equations ... [...] relatively small. Thus, the
regression equation [...] used with considerable confidence [...].

This position may hold if the set of selection variables is identical to the
set of predictors. However, it is not generalizable to a situation where
nonselection variables are also used as predictors. This is because the
variables that enter the regression may differ when curtailment correction is
applied. Any alteration in the variable set will, in addition, affect the
regression weights. Research in selection appears to have ignored this
critical problem.

TABLE 3

MEANS, STANDARD DEVIATIONS, AND INTERCORRELATIONS OF THE PREDICTORS (AND
CRITERION) BEFORE AND AFTER CURTAILMENT

Items: 1. Undergraduate grade point average (economics); 2. Candidate
Excellence by School index (Verbal); 3. Rating of offices held; 4. Admission
Test for Graduate Study in Business (Verbal); 5. Undergraduate grade point
average (junior); 6. [illegible]; 7. Weighted graduate work; 8. Graduate grade
point average. [Correlation coefficients, means, and standard deviations
before and after correction are not recoverable from the source.]

Note. Correlations reported below the main diagonal denote those before
correction (n = 40); the entries above the diagonal denote those after
correction (n = 222).


The means, standard deviations, and inter- 
correlations of predictors before and after 
curtailment correction are reported in Table 
3. The values reported in this table are for 
the predictors that entered the corrected 
model. The intercorrelations of the predictors, 
after correction, are appreciably different 
from those before correction. However, more 
important are the changes in correlations 
with the criterion (graduate grade point 
average). For the Admission Test for Gradu- 
ate Study in Business Verbal score, Candidate 
Excellence by School index Verbal score, and 
undergraduate grade point average (econom- 
ics and junior year), there were substantial 
improvements in correlations with graduate 
grade point average after correction. This evidence
supports the previous assertion that
the applicant population is curtailed on several
important variables.

In order to test the predictive validity of each model, data were collected
for all students enrolled in the classes of 1970-1971. Applying the
curtailment-corrected model to predict graduate grade point average of these
students yields an R = .36, compared to R = .20 for the uncorrected model.
Thus, the corrected model demonstrates higher predictive validity than the
uncorrected model.

It is important to note that the curtailment-corrected model shrank from
R = .86 to R = .58 when applied to the original 1968-1969 enrolled sample.
Since the uncorrected model was developed on the enrolled 1968-1969 sample, it
cannot shrink (R = .69). The relative drop in predictive validity when both
models are applied to the 1970-1971 sample offers additional support for the
stability of the corrected model over time.

A better qualitative understanding of the predictor variables was obtained
from a factor analysis with varimax rotation performed on the
curtailment-corrected variance-covariance matrix. Nine factors accounted for
79% of the variance. A curtailment-corrected regression model of these factors
was developed to predict graduate grade point average. Table 4 presents the
factors and the beta weights of each factor as a predictor of graduate grade
point average.

TABLE 4

FACTOR ANALYSIS AND FACTOR PREDICTION OF CURTAILMENT-CORRECTED ADMISSION MODEL

Columns: factor/variables with high loading on factor; beta weight. [Beta
weights and some factor labels are not recoverable from the source.]

Factors and high-loading variables, in the order listed:
- [label illegible]: Undergraduate grade point average (freshman, sophomore,
  mathematics and statistics, economics, [...]); Rating of letters of
  recommendation
- Undergraduate school excellence: Candidate Excellence by School index
  (Total, Verbal, Quantitative)
- Aptitude for graduate management study: Admission Test for Graduate Study
  in Business ([...], Verbal, Quantitative)
- Marital status: Marital status
- [label illegible]: Undergraduate grade point average (science)
- [label illegible]: [...] involvement in social organizations [...]
- [label illegible]: Undergraduate grade point average (engineering)
- Extracurricular sports activity: Rating of involvement in sports activity;
  Number of sports activities participated in; Weighted sports activity

Note. Factors are listed in the order of their importance (beta weight)
derived from a multiple regression of factors on graduate grade point average.
* p < .10.
** p < .05.

In general, the curtailment-corrected factor analysis demonstrates an
underlying factor structure which predicts graduate grade point average as
well as the seven-variable curtailment-corrected model. Undoubtedly, factors
are more reliable than the individual predictors, but they also add a
significant cost in data collection expense and time.

TABLE 5

COMPARISON OF DECISIONS MADE BY ADMISSIONS COMMITTEE AND THE ADMISSIONS MODEL

                          Model decision
Committee decision   Accepted   Rejected   Total
Accepted                 90         30      120
Rejected                 30         72      102
Total                   120        102      222

Admissions committees are usually concerned with categorical decisions of
whether to accept or reject a candidate and not how well the student will do.⁷
The results in Table 5 demonstrate the significant effect on admission
decisions when the corrected model is used. The model would have rejected 25%
of the students whom the admissions committee accepted. Comparisons of
graduate grade point averages of all of the 30 students accepted by the
committee but rejected by the model (accepted group) with those rejected by
the committee but accepted by the model (rejected group) were not possible
because only 13 of the former group enrolled and obtained a graduate grade
point average, while the latter group never enrolled. Of the 13 who actually
enrolled, 11 were below the average grade earned by the class, offering
support for the model.

Some may argue that the model is only useful for making marginal decisions.
However, the average predicted graduate grade point average for the 30 in the
accepted group was 5.7, compared to 6.66 for the rejected group. In the
applicant population of 222, the average rank (using predicted graduate grade
point average) for the accepted group was 169 compared to 82 for the rejected
group.⁸ The 13 students rejected by the model earned a graduate grade point
average of 6.02, the same as the average predicted graduate grade point
average for the 102 rejected applicants. The average rank of these 13 students
was 159 in the applicant sample of 222. These comparisons demonstrate a
significant improvement resulting from the use of the model.

7 An exception is in allocating scholarships to those students who show the
greatest potential to perform well at school.
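
The percentages quoted from Table 5 follow directly from the four cell counts. The short sketch below recomputes them; the dictionary layout and the function name are hypothetical conveniences, while the counts are those reported in the table.

```python
# Cell counts from Table 5, keyed as (committee decision, model decision).
TABLE_5 = {
    ("accepted", "accepted"): 90,
    ("accepted", "rejected"): 30,
    ("rejected", "accepted"): 30,
    ("rejected", "rejected"): 72,
}


def decision_summary(table):
    """Summarize agreement between committee and model decisions."""
    total = sum(table.values())
    committee_accepted = (
        table[("accepted", "accepted")] + table[("accepted", "rejected")]
    )
    agree = table[("accepted", "accepted")] + table[("rejected", "rejected")]
    return {
        "total": total,
        "committee_accepted": committee_accepted,
        # Share of committee admits that the model would have rejected.
        "overturned_rate": table[("accepted", "rejected")] / committee_accepted,
        # Share of all applicants on whom committee and model agree.
        "agreement_rate": agree / total,
    }


summary = decision_summary(TABLE_5)
```

The overturned rate is 30 of 120, the 25% cited in the text; committee and model agree on 162 of the 222 applicants.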
Some caution is needed in applying curtailment correction formulas. As in the
present study, the construction of matrix V and its subsequent use in
developing the corrected model should be tested by applying the corrected
equation on a new sample of enrolled students. Since the researcher must wait
for criterion data in order to validate the regression equation it is unlikely
that the next class(es) selected will be based on the model; and thus, if the
population of admitted students does not change significantly, one can obtain
an unbiased predictive validation of the model.

Another problem that should be considered with respect to the use of academic
criteria is that such criteria are only an intermediate step in career
development. There is no guarantee that graduate grade point average will be a
valid predictor of future managerial success. In a previous study, a moderate
relationship had been found between graduate grade point average and a
criterion of career progress for Graduate School of Industrial Administration
graduates, thus supporting the use of the academic criteria for the model
developed here (Weinstein & Srinivasan, in press).

A curtailment-corrected model such as the one developed in this paper can be
extremely useful both as a screening device and as a final decision tool. As
cited earlier, there is strong evidence that a sound actuarial model is
superior to clinical judgment in complex decision making. Even if used in a
preliminary selection stage, a model similar to the curtailment-corrected
model may improve the quality of admissions decisions.

8 It should be noted that a greater graduate point average corresponds to a
small value for the rank.

CONCLUSIONS 


Although one could argue the superiority 
of the curtailment-corrected over the uncor- 
rected model purely on theoretical grounds, 
the evidence offered here contributes addi- 
tional empirical support for the improved pre- 
dictive validity of the corrected model. 

Four major findings have been demon- 
strated: 


1. The variables which enter the corrected 
model differ from those entering the uncor- 
rected model. In addition, the regression 
weights and validities of the predictors are 
altered. 

2. The selection model corrected for cur- 
tailment was superior to the uncorrected 
model in its power to predict future academic 
performance. 

3. Some variables such as undergraduate 
economics and junior grades, the Verbal por- 
tion of the Admission Test for Graduate 
Study in Business, and the Verbal portion of 
the Candidate Excellence by School index 
are important predictors of graduate grades. 
In addition, a regression of nine factors (de- 
rived from the corrected variance-covariance 
matrix of the predictors) with graduate grade 
point average indicates that past academic 
excellence, undergraduate school excellence, 
aptitude for graduate management study, 
marital status, extracurricular social activity, 
and undergraduate excellence in science and 
engineering are important determinants of 
graduate performance. 

4. Admissions decisions may be greatly af- 
fected by use of the selection model. The 
model rejected 25% of the applicants ac- 
cepted by the committee. A close analysis of 
these decisions shows that they were not 
marginal cases and some candidates with con- 
siderably more potential than those accepted 
were, in fact, rejected by the committee. 
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PREDICTION OF ADVANCED LEVEL AVIATION PERFORMANCE 
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SELECTION VARIABLES * 


RONALD M. BALE,² GEORGE M. RICKUS, JR.,³ AND ROSALIE K. AMBLER⁴

Naval Aerospace Medical Research Laboratory, Pensacola, Florida

The criterion of success versus failure in undergraduate flight training has
permitted cost effective estimates of the probability of an applicant or
student completing naval flight training. However, a prediction problem
remains for some designated aviators who are not successful in the replacement
air group (RAG), or postgraduate, phase of instruction. This study employed
multiple correlation and [...] to examine RAG completion as a remote criterion
variable. Undergraduate training grades significantly predicted RAG
completion. Had the obtained regression weights been employed, the attrition
rate of a cross-validation sample would have been reduced by 33.8%. Those
skills in undergraduate training that are "mission oriented" as opposed to
academic or flight skills contributed the most to the explained criterion
variance.

1 Opinions and conclusions contained in this report are those of the authors
and do not necessarily reflect the views or endorsement of the Navy
Department.

2 Ronald M. Bale is now at the University of Cincinnati, Ohio.

3 George M. Rickus is now with Bendix Corporation, Southfield, Michigan.

Prediction of success of naval aviation students has been a continuing effort
of Navy psychologists. Considerable cost-effective work has been done with
respect to the prediction of success in the undergraduate level of flight
training (Ambler, Rickus, & Booth, 1970; Berkshire, 1958; Booth & Peterson,
1968; Peterson, Booth, Lane, & Ambler, 1967; Shoenberger, Wherry, & Berkshire,
1963); however, the problem remains of predicting the more remote criterion of
operational performance. Some pilots complete the undergraduate phases of
flight training only to fail or withdraw at the postgraduate level or beyond.

Historically, the development of a remote criterion against which to measure
proficiency at the operational task has been a much desired goal. The major
problem in naval aviation thus far has been the nature of the operational
setting in which the naval aviator

must function. A method has not yet been 
found to equate the characteristics and diffi-
culty of missions across aircraft type in order
to get a reliable and useful measure of pilot 
performance. The possibility of rating indi- 
viduals within squadrons with respect to per- 
formance has been explored (Berkshire, 1958; 
Boyles, Prunkl, & Wahlberg, 1969; Jenkins, 
Ewart, & Carroll, 1950). Though isolated 
efforts with this approach have been fruitful, 
the personnel in question are usually reluctant 
to divulge information perceived to be detri- 
mental to a fellow aviator, and the method is 
not routinely applicable. 

Naval aviation students require about 18 
months to complete the undergraduate phases 
of flight training. The time varies slightly 
with aircraft type. Upon completion, the stu- 
dent is designated a naval aviator. Prior to 
fleet assignment, the new aviator must com- 
plete approximately 6 months of replacement 
air group (RAG) training. This postgraduate 
phase provides intensified instruction in the 
techniques and mission of the specific fleet 
aircraft to which the pilot will eventually be 
assigned. Since the RAG requires the newly 
designated aviator to perform tasks almost 
identical to those demanded in the fleet, it is 
reasonable to assume that RAG performance 
is predictive of fleet performance. The RAG, 
therefore, can be considered a potential source 
of criterion data beyond undergraduate train- 
TABLE 1
POTENTIAL PREDICTORS ARRANGED IN TEMPORAL
SEQUENCE OF ACQUISITION

Description/Variable                                  Final β weight

Selection tests given at recruiting center
  Aviation Qualification Test (verbal and
    numerical intelligence)
  Mechanical Comprehension Test
  Spatial Apperception Test
  Biographical Inventory

Math Exemption Examinations a                              .088
Physics Exemption Examinations a

Courses preceding flight training
  Aerodynamics
  Navigation
  Engineering (aviation power plants and
    accessories)
  Aviation physiology
  Physical training

Primary flight plane training (T-34 propeller
trainer)
  Pre-Solo
  Precision

Basic flight training in T-2 B/C jet trainer
  Transition
  Precision/acrobatics
  Basic instruments (basic)
  Night flying
  Radio instruments
  Formation
  Carrier qualification (basic)

Basic ground school final grade

Advanced flight training in high performance
jet trainer (F-9J or AA)
  Transition
  Basic instruments (advanced)                             .125
  Instrument navigation
  Familiarization
  Operations/navigation
  Air to ground weapons
  Tactics
  Air to air weapons
  Carrier qualification (advanced)

Advanced ground school final grade                         .070

a The exemption examinations are given before the student
begins academic courses; failure leads to remedial instruction.

ing. This position is supported further by the
work of Jones (1959), which discussed the


R. M. BALE, G. M. RICKUS, AND R. K. AMBLER


utility of simplex theory and a successive
approach to training performance and predic-
tion; that is, given a sequence A,
B, C, and D, D is predicted better by C than
by either A or B. In this case, one might as-
sume that RAG performance would be the
best predictor of fleet performance and that
RAG performance would be a good inter-
mediate criterion for assessing training
performance. This study, therefore, investi-
gated the relationship between performance in
the undergraduate phases of naval flight
training and the postgraduate or RAG phase.


METHOD 
Subjects 


The sample group for this study included all
designated naval jet aviators, 218 of whom were
assigned to RAG training in East Coast squadrons
and 374 of whom were assigned to the West Coast
during the period from November 1966 into
1967. Excluded from consideration were those avi-
ators who were dropped from training for medical
reasons, personal hardship, disciplinary action, or
death.


Procedure 


Grades from undergraduate pilot training were
obtained for each subject. These were considered
the potential predictor variables. A brief description
of these variables is contained in Table 1. Addition-
ally, subjects who completed RAG training success-
fully were assigned a score of 1 and those
who failed were assigned a score of 0.

For purposes of cross-validation, the total sample
was divided into two subgroups that corresponded to
the existing division of the East and West Coasts.
This sample division was considered to be more
appropriate than a random split, which might pro-
duce subsamples with a greater degree of homo-
geneity and thus result in a less conservative esti-
mate of cross-validity. Failure or attrition rates were
then calculated for each subsample and for the
total sample to demonstrate subsample equivalency
with respect to the criterion distribution.

A Wherry-Doolittle multiple correlation coefficient
was computed for the larger, West Coast sample,
using the 33 potential predictor variables and the
pass/fail performance measure as the criterion
variable. An F ratio was calculated to test the
significance of the increase in the multiple R
resulting from the addition of each predictor
variable to emerge in the analysis. When the
obtained F value was no longer significant,
additional variables were not considered. Raw
score regression weights derived from this
multiple correlation analysis on the West Coast
sample were then used to compute predictor
(regression) scores for all subjects in the East
Coast sample.


PREDICTION OF ADVANCED LEVEL AVIATION PERFORMANCE 


The point-biserial correlation coefficient between the re-
sulting distribution of predictor scores and the pass/
fail criterion was considered the index of cross-
validation.
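
The cross-validation step described in this Procedure can be sketched in a few lines. This is a minimal illustration, not the authors' computation: the regression weights, intercept, and grade values below are hypothetical, and the function names are chosen only for this example. Weights obtained on a derivation sample are applied unchanged to a hold-out sample, and the point-biserial correlation between the resulting predictor scores and the pass/fail criterion (scored 1/0) serves as the index of cross-validation.

```python
from statistics import mean, pstdev


def predictor_scores(weights, intercept, rows):
    """Apply fixed raw-score regression weights to each subject's grades."""
    return [intercept + sum(w * x for w, x in zip(weights, row)) for row in rows]


def point_biserial(scores, passed):
    """Point-biserial correlation between continuous scores and a 0/1 criterion."""
    ones = [s for s, p in zip(scores, passed) if p == 1]
    zeros = [s for s, p in zip(scores, passed) if p == 0]
    prop = len(ones) / len(scores)  # proportion scored 1 (pass)
    # (M1 - M0) / SD * sqrt(p * q), with SD the population standard deviation
    return (mean(ones) - mean(zeros)) / pstdev(scores) * (prop * (1 - prop)) ** 0.5


# Hypothetical weights, as if derived on the larger (derivation) sample.
weights, intercept = [0.5, 0.3], 1.0

# Hypothetical hold-out sample: two grades per subject and a pass/fail outcome.
holdout_grades = [[3.0, 2.5], [2.0, 2.0], [3.5, 3.0], [1.5, 2.5], [3.0, 3.5]]
holdout_passed = [1, 0, 1, 0, 1]

scores = predictor_scores(weights, intercept, holdout_grades)
print(round(point_biserial(scores, holdout_passed), 3))
```

Because the weights are never re-estimated on the hold-out sample, the resulting coefficient is not inflated by capitalization on chance in the derivation sample.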


RESULTS 

The attrition rate for the combined
sample of 592 aviators was 13%. The East
Coast squadrons lost 13.3% of their initial
input, and the West Coast group lost 12.8%.
The multiple R and F tests resulted in the
selection of predictors of the criterion from
among these 33 training grades. The resultant R,
corrected for shrinkage by the Wherry-Doo-
little method, was .43 (p < .001). The beta
weights for the selected variables are included
in Table 1. The point-biserial correlation coefficient
between the predictor scores and the criterion
for the East Coast sample was .36 (p < .001),
which is a reasonable cross-validation coeffi-
cient. Given these empirical results, that is,
successful cross-validation, it was decided to
examine the findings from two points of view:
(a) the relevance of various portions of train-
ing to the criterion and (b) the practical
application of results. To accomplish the first
objective, the variance explained by the multi-
ple correlation with the criterion for the total
sample (R = .46) was partitioned in two
ways. Table 2 shows the proportion of ex-
plained criterion variance among the various
grades clustered in terms of meaningful cate-
gories of training elements. These categories
were determined on an a priori basis. In Table 2
it is seen that those measures concerned with
mission/combat skills accounted for the lar-
gest amount of explained variance, with flight


TABLE 2
CONTRIBUTION OF VARIOUS ELEMENTS OF AVIATION
TRAINING TO PREDICTION

Training element               Proportion of explained
                               variance

Selection tests                 .062
Academic training               .095
Physical training               .012
Flight skills                   .278
Instrument skills               .191
Mission/combat skills           .362
Total                          1.000

TABLE 3

CONTRIBUTION OF VARIOUS ELEMENTS OF AVIATION
TRAINING TO PREDICTION OF SATISFACTORINESS—
CLUSTERED BY TEMPORAL SEQUENCE

Training elements              Proportion of explained
                               variance

Selection tests                 .062
Preflight training grades       .061
                                .042
                                .143
                                .001
Advanced flight grades          .576
Total                          1.000


skills and instrument skills also making a 
significant contribution. Table 3 shows the 
same kind of data for the grades clustered by 
temporal sequence of occurrence within the 
undergraduate pilot training syllabus. The 
simplex model is affirmed by the fact that
advanced, basic, and primary flight grades 
contributed in that order. It should be noted, 
of course, that advanced flight consists mainly 
of instrument and mission/combat skill acqui- 
sition. The proportions of explained criterion 
variance displayed in Tables 2 and 3 were 
obtained by using a forcing function in suc- 
cessive computations of R. This technique 
forced grades sequentially by cluster into the 
R computations so that percentage of variance 
explained could be identified for each cluster 
independently of the others. 
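
One plausible reading of this forcing procedure is a hierarchical regression in which clusters of grades are entered one at a time and each cluster is credited with the increase in R² at the step where it enters. The source does not give the computational details, so the sketch below is an assumption rather than the authors' method; the function names and the toy data are hypothetical.

```python
def r_squared(columns, y):
    """R^2 from an ordinary least-squares fit of y on the given predictor columns."""
    n = len(y)
    # Design matrix with a leading intercept column.
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def sequential_proportions(clusters, y):
    """Share of the final R^2 gained as each cluster of predictors is forced in."""
    entered, previous, gains = [], 0.0, []
    for cluster in clusters:
        entered.extend(cluster)          # force this cluster into the equation
        current = r_squared(entered, y)
        gains.append(current - previous)
        previous = current
    return [g / previous for g in gains]  # proportions sum to 1.0


# Toy data (hypothetical): two single-variable clusters.
grade_a = [1, 2, 3, 4, 5, 6]
grade_b = [1, -1, 1, -1, 1, -1]
outcome = [a + b for a, b in zip(grade_a, grade_b)]
print(sequential_proportions([[grade_a], [grade_b]], outcome))
```

Under this reading the credit given to a cluster depends on the order of entry whenever clusters are correlated, which is worth keeping in mind when comparing the two partitions.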

In order to explore the potential applica- 
tion of the predictor scores, separate fre- 
quency distributions of predictor scores for 
the successful and nonsuccessful aviators in 
the East Coast or cross-validation sample
were compiled. The proportion of successes 
and nonsuccesses at or below a given score 
could then be examined. By this means it was 
possible to identify an optimal cutoff score 
that would screen out the maximum number 
of potential failures at the cost of a minimum 
number of successes. The projected reduction 
in the East Coast attrition rate was then com- 
puted as if the sample had been screened on 
this cutoff score. These data are shown in 
Table 4. Had a predictor score of 760 been 
utilized as a cutoff point, 41.4% of the non- 
successful aviators could have been rechan- 
TABLE 4
PERCENTAGE OF SUCCESSFUL AND UNSUCCESSFUL
REPLACEMENT AIR GROUP AVIATORS BELOW A
GIVEN PREDICTOR SCORE FOR CROSS-
VALIDATION SAMPLE (EAST
COAST SAMPLE)

                    % unsuccessful       % successful
Predictor score     at or below          at or below
                    predictor score      predictor score
                    (n = 29)             (n = 189)

  550                 3.4                  0.5
  655                10.3                  2.4
  760                41.4                  6.9
  850                65.5                 19.0
  955                79.0                 33.3
1,060                89.7                 60.8
1,150                93.7                 81.5
1,210               100.0                 89.9


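neled prior to commencement of the jet RAG
training. The false negatives would have
amounted to 6.9% of the successful aviators.

Under the existing system, the East Coast
jet RAGs suffered a 13.3% attrition rate. Had
the cutoff score been applied, the attrition
rate would have been reduced from 13.3% to
8.8%, or a reduction of 33.8%.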
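
The arithmetic behind these figures can be checked with a short sketch. It assumes 29 unsuccessful and 189 successful aviators in the East Coast sample (the 218 East Coast aviators less the 29 failures) and applies the Table 4 percentages at a cutoff of 760; the function name and the rounding of percentages to whole persons are choices made for this illustration.

```python
def projected_attrition(n_fail, n_pass, pct_fail_cut, pct_pass_cut):
    """Attrition before and after screening at a cutoff score.

    pct_fail_cut / pct_pass_cut: percentage of unsuccessful / successful
    aviators scoring at or below the cutoff (and therefore screened out).
    """
    fails_removed = round(n_fail * pct_fail_cut / 100)
    passes_removed = round(n_pass * pct_pass_cut / 100)
    remaining_fail = n_fail - fails_removed
    remaining_total = n_fail + n_pass - fails_removed - passes_removed
    before = n_fail / (n_fail + n_pass)
    after = remaining_fail / remaining_total
    return before, after, (before - after) / before


before, after, reduction = projected_attrition(29, 189, 41.4, 6.9)
print(f"{before:.1%} -> {after:.1%}, relative reduction {reduction:.1%}")
```

With these inputs, 12 of the 29 failures and 13 of the 189 successes fall at or below the cutoff, leaving 17 failures among 193 aviators.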
DISCUSSION


The intent regarding
actuarial data of the type shown in Table 4 is
not to discard 6.9% of the successes in order
to eliminate 41.4% of the failures. The intent
is to minimize or prevent the misassignment
of individuals by diverting them to the type of
aircraft and mission assignment in which they
have the greatest probability of success. Actuarial
data for the various flight options must be
developed. Then there would be a basis for
preventing misassignments by diverting those
with poor probability of success in one option
to another in which the probability of success
is greater.


who is excluded from flying 
might be quite Proficient in 
versa. Those, h 
all options should be « 
feasibility of thi 


those 
ption 


the p.3 
ho ar 


‘selecte; 
The group of aviators studied here can be
considered a highly restricted group inasmuch
as they are the survivors of vigorous selection
and training procedures. Furthermore, those
selected for jet training are nearly always
drawn from the top of the class in
early ground school and primary flight grades.
The fact that it is possible to predict later
success within such a highly restricted group
is encouraging. The next logical step is to
expand the RAG criterion from a dichotomous
to a continuous type of performance measure.
This will require standardization of
procedures across all RAGs. Given such a
measure, coupled with the rationale of the
successive approach to prediction of fleet
performance, another goal could be reached;
that is, those aspects of training most rele-
vant to success in the RAG and hence the
fleet could be identified with greater precision
than was possible in this study. This information
could provide valuable feedback for improved
monitoring of syllabus effectiveness.
USE OF MODELING TO MODIFY CHILDREN'S RESPONSES TO
A NATURAL, POTENTIALLY STRESSFUL SITUATION 1


DAVID T. A. VERNON 2


University of Missouri


Prior research on modeling suggests that the sight of another person behaving
calmly in the presence of a feared stimulus reduces later aversive behavior
on the part of phobic (or especially fearful) persons. The present study
tested whether such an experience would similarly affect children experiencing
a commonly feared stimulus (i.e., anesthesia induction). It was found that
subjects exposed to such modeling behaved as if they were less afraid of
anesthesia induction than did control subjects. Differences between experi-
mental and control subjects were smallest during those phases of anesthesia
induction which were not portrayed by modeling.


The findings of several recent studies sug-
gest that the observed behavior of one person
in a potentially stressful situation will influ-
ence the later responses of others. For
example, the sight of another person behaving
calmly with a feared but harmless object
(e.g., a harmless snake) encourages the ob-
server to approach the object more closely and
interact with it more intimately than he
would otherwise (Bandura, Grusec, & Men-
love, 1967; Bandura & Menlove, 1968; Geer
& Turtletaub, 1967; and the earlier descrip-
tion of Jones, 1924).

While the findings of these studies have
been consistent and clear-cut, the methods
employed would seem to greatly limit their
generalizability. Three of the four studies
were carried out in laboratory settings using
university-affiliated persons as subjects. All
four studies used stimulus objects that many
persons do not fear (e.g., harmless animals)

1 
2 Requests for reprints should be sent to David T. A.
Vernon, School of Medicine, B 129 TD-3 West,
University of Missouri, Columbia, Missouri 65201.


and compensated for this by using subjects 
who were either phobic or especially selected 
for their high fear. All of these studies used 
research designs in which the primary depen- 
dent variable was the degree and quality of 
the subjects’ contact with a once-feared 
stimulus object. 

The major purpose of the present study 
was to explore the degree to which the above 
findings were applicable to a much different 
type of setting, that is, a natural field setting 
in which a commonly feared stimulus was 
experienced to an equivalent degree by all 
members of a sample that was not specially 
selected for fearfulness. The study was car- 
ried out in a children's hospital. The children 
in the experimental group were “prepared” for 
anesthesia induction by showing them a movie 
of other children responding calmly to anes- 
thesia induction. Data on their behavior were 
then compared to those of a control group 
which received no special preparation. 

The following hypotheses were tested: 


1. Children prepared as described above 
behave more calmly in potentially stressful 
situations than children not so prepared. This 
pertains to their behavior (a) during the 
period of threat prior to the actual impact of 
the stress stimulus and (b) during the period
of stress impact itself. Since the prepared chil- 
dren experience less stress than children who 
are not prepared, their behavior (c) during 
the postimpact period—that is, during the 
period following discharge from the hospital— 
gives less evidence of emotional distress. 




A second purpose of the study was to in- 
vestigate the extent to which the effect of 
preparation varied with birth order. It was
anticipated that the effects predicted above 
vary with birth order in the following manner. 

2. Early-born children (firstborn and only
children) are influenced by preparation to a
greater extent than later-born children during
the (a) threat, (b) impact, and (c) post-
impact periods (that is, there are significant
interactions between birth order and prepara- 
tion). 

This hypothesis seemed logical since early- 
born persons in potentially stressful situations 
have been shown to be more subject to the 
influence of peer affiliates than are later-born 
persons (Sampson, 1965; Schachter, 1959; 
Wrightsman, 1960; Zucker, Manosevitz, & 
Lanyon, 1968) and such peer affiliates may, 
in effect, constitute “models” as the term has 
been used by Bandura and others, 


METHOD 


The general plan of thi 


€ experiment was as follows: 
The subjects were childr 


en w] 


responses were 
od ratings were 
ne day later the 
test of their 
aca = 
dures. Six days 1; am fed a 
Dosthospita] beha: questionnaire i 
Changes in their children’s behavior si 
tions, Finally, these last tw i 
lest and the Posthospit: 
Were repeated 30 q 


nce the opera. 
(the Projective 


al behavior questionnaire) 


ays after discharge, 
Subjects 


The 38 children who were the subjects of this
study were under the care of a general pediatric
surgeon and an ear, nose, and throat specialist. They
included 9 inguinal herniorrhaphy patients, 19 chil-
dren having tonsil and adenoid operations, and 10
having other forms of minor elective surgery.



to nine years) 
evenly divided between boys and girls, 
All children having op, 
fell within the Prescribe 
to the same treatment condition, either experimental
or control. The assignment of these groups to treat-
ments was randomly determined. The investigator
had no contact with the children or their parents,
and prior to treatment assignment he had no infor-
mation about them apart from their age a


Experimental Treatments 


E experimental 

The film that the subjects in T PA 
group saw prior to surgery showe nine year 
(actors), ranging in age from five to r to anes- 
responding calmly and without ee eae for 
thesia induction. The film was Ha E scribed 2$ 
the models’ behavior, which cannot be e the 
typical). It was made in the hospital ital person- 
research was conducted, and it used hospi sible was 
nel as actors. The equipment that was vore similar 
Standard, and the techniques employed we 
to those used by most anesthesiologists. dr parents 

The subjects saw the film without e two oF 
either individually or in small groups uw to 4 
three children each. It was shown from d prior 
minutes before the subjects left for surgery RE film 
to preoperative medication. At the start D. ld show 
the children. were told that the movie wou ir oW 
them what they would sce when they had ae 
Operations. The entire film lasted 12 minu 
Subjects’ attention was invariably excellent. m. 

The children who did not see the movie S att ] 
beside their beds with their parents and C ciate 
read, etc. Apart from the movie and : the 
activities (e.g., contact with the geom iden- 
experiences of the two groups were seeming E 0 
tical and in no Way different from the experie’ 


in the 
i in 

other children having the same operations 

hospital. 


n or 


Instruments and Their Application 


by E 

Global mood scale. Global mood was rated on a
7-interval scale (Vernon, Foley, & Schulman, 1967),
the intervals of which ranged from "... active in
happy or contented way" (scored 1) to "scream full
blast, intense and constant crying" (scored 7).
Ratings were averaged across four phases: (a) Threat
Phase A—from entry into surgery until the child
started to enter his operating room, (b) Threat
Phase B—from entry into the operating room until
the beginning of anesthesia, (c) Impact Phase A—the
first minute of anesthesia induction, and (d) Impact
Phase B—from the end of Impact Phase A until a
surgical level of anesthesia was reached.
ca 5 
on Of the global mood si SUE 
of prior research in Ha 1967) 
ance, 1968; Vernon et reement nt 
(1968) reported the percent AE depende 
ratings between pairs : i 
to range from 79% to 91% d e 
FsPOnses to injections, Torrance also Baie r a 
individual comparisons of global mood cm in£ ing 
With two series of telemetered heart rate 1 result 
41 children undergoing injections. ' E "n 
Correlations had media pr ME o 
(P< 05 in both cases Piaje. 


i sci 
modest Support for the validity of the 


F 


| 
| 


— 


In addition, in the present study the global mood ratings
for Threat Phase B and Impact Phase A showed
moderate correlations with independent ratings of
overt fear made by the anesthesiologists according
to a 3-interval rating scale (r = .46 and .53, respec-
tively, p < .01).

Projective test. This was a modification of a test
devised by Amen (Dorkey & Amen, 1947). It
consisted of a series of hospital scenes in which the
face of the central child was left blank. The task
of the child was to select the face, either happy or
sad, which he felt was appropriate to the action.
The test was administered on two occasions. The
first was on the day following surgery, either at
home shortly after the child had been discharged
(as was the case for the tonsillectomy subjects) or
in the hospital shortly before discharge. The second
administration took place in the child's home 30 days
after discharge from the hospital.

Prior research supported the validity of the total
number of sad faces assigned to hospital scenes as a
measure of fear or state of anxiety during hospital-
ization; that is, the total number of sad faces
assigned to these scenes correlated significantly with
nurses' ratings of (a) children's general
level of fear and nervousness during hospitalization
and (b) children's ability to get on close friendly
terms with nurses.

Posthospital behavior questionnaire. This consisted
of items (e.g., "Does your child seem to be afraid
of leaving the house with you?" "Does your child
follow you everywhere around the house?" "Is your
child afraid of the dark?"). It was filled out by the
child's mother on two occasions—6 days and 30 days
after the child's discharge from the hospital. On each
item the mother compared the child's behavior in
the past week with his behavior before hospital-
ization. Five response alternatives were provided,
ranging from "much less than before" (scored 1) to
"much more than before" (scored 5). A total score
was calculated by simply summing the individual
item scores indicated above.

Prior work with this questionnaire (Vernon, Schul-
man, & Foley, 1966) has provided support for the
validity (i.e., agreement with interviews) and test-
retest reliability of such scores.


RESULTS 


The principal statistical tests used were
two-way analyses of variance involving the
independent variables experimental treatment
(preparation vs. no preparation) and birth
order (early born vs. later born).3


3 Prior to the above analyses, the four major
groups (i.e., early-born experimental, early-born
control, etc.) were compared on three variables
which seemed particularly likely to confound the
findings of the critical comparisons: age, sex, and
physician (i.e., general pediatric surgeon vs. ear, nose,
and throat specialist). In no case did the main effects
or interactions approach statistical significance; the
FIG. 1. Mean global mood ratings: Four phases. (Horizontal axis: phases of
anesthesia induction, Threat A, Threat B, Impact A, and Impact B.)


Behavior Before and Immediately After the 
Operation 


Data relevant to the subjects’ behavior 
from the time of their entry into surgery until 
a surgical level of anesthesia was reached are 
summarized in Figure 1. 

With regard to these data there are several 
points which bear emphasis. First, the data 
support Hypothesis 1a quite well. In both the
threat phases (Threat Phase A—waiting in 
the surgery hall prior to entering the operat- 
ing rooms—and Threat Phase B—from the 
time they entered the operating room until 
anesthesia began), the subjects who had seen 
the movie appeared to be considerably less 
frightened and upset (F > 10.60, df = 1/34, 
p < .001, for both comparisons). 

Second, contrary to the expectations of 
Hypothesis 1b, this difference dwindled in the
impact phases. In the first minute of anes-
thesia induction (Impact Phase A), the differ-
ence only approached significance (F = 3.82,
df = 1/34, .10 > p > .05); during the en-
suing period of induction, differences were
almost nonexistent. (Most of the children 
were asleep during this last phase.) 

Third, Hypothesis 2a—that the effect of 
preparation on threat phase behavior is most 
pronounced in the early-born children— 


four groups were reasonably well balanced on the 
variables. 
TABLE 1


MEAN POSTHOSPITAL BEHAVIOR MEASURES
(7- AND 30-DAY FOLLOW-UP)


                        7-day follow-up            30-day follow-up
Subjects          n    PHBQ      Hospital        PHBQ      Hospital
                       total     picture         total     picture
                       score     projective      score     projective
Early born
  Prepared
  Not prepared    8    69.4      3.6             69.5      4.0
Later born
  Prepared       10    71.4      4.2             68.0      4.4
  Not prepared    5    71.7      4.8             69.9      4.3

Note. PHBQ = Posthospital behavior questionnaire.


received modest Support. Viewing the film of 
the models’ behavior was associated with 4 
reduction of approximately 2.0 global mood 


Scale points in the early-born children and 


-born children in 
children waited in 
er, this difference 
Significance (F= 
1 10>p> 05), 
the expected interaction Was even smaller in 

(F= 247, df= 1/34, p > 
expectations 


: Tap IE 
the differentia] effect of Preparation 


and later-born children did nog 
dicted in Hypothesis 26, 


4 The sample size here (N = 35) is less than
that reported earlier. Two mothers were passively
uncooperative on follow-up. One family moved and
could not be located.




to reduce the degree of upset of the later-born
children and increase the degree of upset of
early-born children, the latter being contrary
to expectations. This finding appeared to be
consistent across both measures: It ap-
proached statistical significance on the projec-
tive test (F = 3.39, df = 1/31, .10 > p > .05)
and reached statistical significance on the
total posthospital behavior questionnaire score
(F = 4.53, df = 1/31, .05 > p > .01).

The data from the parents' ratings for the
fourth week look quite different. The signifi-
cant and near significant Preparation × Birth
Order interactions that were evident in the
first-week data disappeared (F < 1.0), and in
their place there was a significant main effect
for preparation on the total posthospital
behavior questionnaire score (F = 5.19, df =
1/31, .05 > p > .01). The direction of the
differences at this time was in line with Hy-
pothesis 1c; the children who were prepared
for anesthesia induction by seeing the movie
appeared to be less upset than those not pre-
pared. Inspection of the means revealed that
the disappearance of the significant inter-
actions and the appearance of the significant
main effect for preparation were due to
changes in the ratings of the early-born chil-
dren who saw the movie. The total post-
hospital behavior questionnaire scores of these
persons showed a decrease in behavior indic-
ative of emotional upset while persons in other
groups (e.g., early-born children who did not
see the movie) remained relatively stable.


DISCUSSION
The main features of the data appear to
extend and complement the findings of prior
research on the use of models to reduce
aversive behavior. The effects of viewing a model
appear to apply to unselected children
in a commonly feared stress situation as well
as to phobic persons.

However, the effects of this preparation
were not uniform across all measures. Thus,
in the present study the prepared subjects did
not exhibit advantages during anesthesia
induction itself or during their first week
after discharge (in the initial follow-up). This
may have been the result of (a) the discom-
forting and unexpected sensations caused by
anesthesia and normal postoperative pain and
(b) the fact that these two aspects of the
subjects' experience were not covered at all
in the preparation film. Such factors may have
produced relatively high discrepancies be-
tween expectations and experience and, cor-
respondingly, a transitory period of behavior
which was contrary to the hypotheses at the
times of maximum discrepancy (Janis, 1958).

The emphasis on expectations as an inter-
vening variable that explains variations in
the effectiveness of the modeling film also
agrees with another feature of the data. The
subjects who saw the movie acted as if they
expected anesthesia induction to be benign.
They were relatively calm during the initial
threat phase (Threat Phase A)—while wait-
ing to enter the operating room, a setting
which had not been covered in the movie. The
differences between the prepared and unpre-
pared subjects were as great during this phase
as they were during Threat Phase B, a phase
which was covered in the movie.

The above speculations about the role of
expectations in modeling clearly suggest the
need for subsequent studies. However, it is
interesting to note that this emphasis on
expectations agrees with current emphases
on cognitive factors in both investigations of
modeling (Bandura & Menlove, 1968) and
investigations of other aspects of stress be-
havior (Haggard, 1943; Janis, 1958; Lazarus,
1966).

The findings of the present study provided
only limited support for the notion
represented by Hypothesis 2—that the effects
of viewing the modeling film would be greatest
for early-born subjects. The expected
pattern of means occurred to some extent in
the threat phase mood data and, to this
extent, agrees with results of prior stress
research—both that dealing with the influence
of communication (Schachter, 1959;
Suedfeld, 1968, 1969) and that dealing with
the phenomena of affiliation under stress (Kissel,
1965; Wrightsman, 1960; Zucker et al.,
1968). Data from other phases offered even
less support for the second hypothesis, as was
noted earlier.

Consideration may also be given to the
practical implications of this research, despite

the fact that the long-term sequelae of prep-
aration are not entirely clear. The fact that
preparation was associated with a diminution 
of upset at both the initial phase of the poten- 
tially stressful experience and later, approx- 
imately four weeks after discharge from the 
hospital, suggests that real benefits may 
accrue from preparation to children who must 
undergo such experiences. Whether or not the 
benefits of preparation were personally signifi- 
cant to these children as well as statistically 
significant is a question that cannot be 
answered with the data at hand. The same is 
true for the question of whether or not such 
benefits might have been increased with a 
more comprehensive preparation routine or 
repeated, graded presentations of the type 
found to be especially effective by others
(Bandura & Menlove, 1968). The children in 
this study were prepared for only a fraction 
of their total hospital experience—only for 
anesthesia induction and with a purely visual 
presentation. Had the prepared children also 
been prepared for the nonvisual sensations of 
the anesthesia, the postoperative pain and dis- 
comfort, the separation from their families, 
etc., they might have experienced more
impressive postoperative benefits. 

The fluctuations and differences in the data 
from different time periods (as they were 
interpreted) suggest that preparation routines 
such as the one used here constitute effective 
means of reducing the stressfulness of poten- 
tially stressful situations only as long as they 
create accurate expectations. Thus, consider- 
able effort should be expended to make them 
as accurate and comprehensive as possible. 


REFERENCES 


Bandura, A., Grusec, J. E., & Menlove, F. L.
Vicarious extinction of avoidance behavior. Journal
of Personality and Social Psychology, 1967, 5,
16-23.

Bandura, A., & Menlove, F. L. Factors determining
vicarious extinction of avoidance behavior through
symbolic modeling. Journal of Personality and
Social Psychology, 1968, 8, 99-108.

Dorkey, M., & Amen, E. W. A continuation study
of anxiety reactions in young children by means
of a projective technique. Genetic Psychology
Monographs, 1947, 35, 139-183.

Geer, J. H., & Turtletaub, A. Fear reduction fol-
lowing observation of a model. Journal of Per-
sonality and Social Psychology, 1967, 6, 327-331.

Haggard, E. Some conditions determining adjust-
ment during and readjustment following experi-
mentally induced stress. In S. Tomkins (Ed.),
Contemporary psychology. Cambridge: Harvard
University Press, 1943.

Janis, I. Psychological stress. New York: Wiley,
1958.

Jones, M. C. The elimination of children's fears.
Journal of Experimental Psychology, 1924, 7,
382-390.

Kissel, S. Stress-reducing properties of social stimuli.
Journal of Personality and Social Psychology,
1965, 2, 378-384.

Lazarus, R. S. Psychological stress and the coping
process. New York: McGraw-Hill, 1966.

Sampson, E. The study of ordinal position: Anteced-
ents and outcomes. In B. A. Maher (Ed.), Prog-
ress in experimental personality research. Vol. 2.
New York: Academic Press, 1965.

Schachter, S. The psychology of affiliation. Stan-
ford, Calif.: Stanford University Press, 1959.

Suedfeld, P. Anticipated and experienced stress in
sensory deprivation as a function of orientation
and ordinal position. Journal of Social Psychology,
1968, 76, 259-263.

Suedfeld, P. Sensory deprivation stress: Birth order
and instructional set as interacting variables.
Journal of Personality and Social Psychology, 1969,
11, 70-74.

Torrance, J. T. Children's reactions to intramuscular
injections: A comparative study of needle and jet
injections. Unpublished manuscript, School
of Nursing, Case Western Reserve University, 1968.

Vernon, D. T. A., Foley, J. M., & Schulman, J. L.
Effect of mother-child separation and birth order
on young children's responses to two potentially
stressful experiences. Journal of Personality and So-
cial Psychology, 1967, 5, 162-174.

Vernon, D. T. A., Schulman, J. L., & Foley, J. M.
Changes in children's behavior after hospital-
ization. American Journal of the Diseases of
Children, 1966, 111, 581-593.

Wrightsman, L. S. Effects of waiting with others
on changes in level of felt anxiety. Journal of
Abnormal and Social Psychology, 1960, 61,
216-222.

Zucker, R. A., Manosevitz, M., & Lanyon, R. I.
Birth order, anxiety, and affiliation during a crisis.
Journal of Personality and Social Psychology, 1968,
8, 354-359.


(Received June 12, 1972) 


Journal of Applied Psychology
1973, Vol. 58, No. 3, 357-361


FACILITATION OF LEARNING FROM A TECHNICAL MANUAL:
AN EXPLORATORY INVESTIGATION


NEIL C. KALT¹ AND KATHERINE MERLO BARRETT

Baruch College, City University of New York


[...] determine whether the inclusion of advance organizers,
postquestions, concrete illustrations, and delayed review in a technical
manual facilitates learning and enhances the manual's effectiveness as a
reference source. [A manual containing these] techniques was given to one
group of engineers. [A second group was] given the same manual without
these techniques. The findings indicated that [the experimental] group
recalled more information, found more of the information required on a
look-up test in less time, and had more favorable attitudes toward the
manual than the control group. Moreover, the version of the manual
accounted for a substantial proportion of variance in recall test scores,
look-up test scores, and ratings.


A large part of the information needed to perform many types of
industrial jobs is contained in technical manuals. In an attempt to
convey that information to the job occupant, manuals are often used as
training devices and reference sources. As a result, a technical manual
can be an important determinant of job performance. For example, a job
occupant can learn much from a manual and refer to it as needed or he can
learn little from a manual and thereafter ignore it. Since it can affect
job performance, a technical manual needs to be designed in a way that
facilitates learning and enhances its effectiveness as a reference source.

The seeds of that design appear to reside in studies of learning from
prose (Ausubel, 1960; Ausubel & Youssef, 1963; Billey, 1970; Bruning,
1968; Frase, 1968; Leith, Brian, & [illegible], 1962; Natkin & Stabler,
1969; Rothkopf, 1966; Rothkopf & Bisbicos, 1967; [illegible] & Cole,
1963, 1966). Using [word illegible] chosen prose materials in controlled
laboratory settings, these studies tried to determine whether the
addition of a hypothesized facilitator to prose materials increases
learning. The results indicate that learning is enhanced by (a) advance
organizers, organizing concepts presented before the prose material;
(b) postquestions, questions inserted after important passages; and
(c) delayed review, a review of a topic sometime after that topic has
been presented.

¹ Requests for reprints should be sent to Neil C. Kalt, Department of
Psychology, Baruch College, City University of New York, 17 Lexington
Avenue, New York, New York 10010.

The present study sought to determine whether these techniques also
facilitate learning from a technical manual and enhance its
effectiveness as a reference source. In addition, a fourth hypothesized
facilitator, concrete illustrations, was added in response to
employees' requests that examples and graphics be used in technical
manuals to illustrate major points.


METHOD 


Design of the Manual 

[The study] was carried out in a major [...]. Two versions of a
technical [manual] were used. [To test] the hypotheses that advance
organizers, postquestions, concrete illustrations, and delayed review
[facilitate learning] and enhance the manual's effectiveness as a
reference source, these four techniques were woven in[to one version of
the] manual. Although it would have been desirable to determine their
separate and interactive effects, this was prevented by the constraints
of the research setting, that is, by subject availability and by the
objectives of the participating organization. Accordingly, the manual
without the several techniques was used as the control version, while
the manual containing these techniques served as the experimental
version. The control version contained 32 pages, the experimental
version 42 pages.

Advance organizers were defined as capsule summaries of each topic
provided before the material relating to that topic. These summaries
were printed at the left-hand side of the page, with a box around them
to set them off from the rest of the text. Thirty-eight capsule
summaries appeared throughout the manual. Their purpose was explained
to the subject in the introduction.

Postquestions were defined as questions inserted after every major
topic in the manual. These ques[tions ... obj]ectives of the manual,
[...] two questions [...].

Concrete illustrations were provided in three forms. First, three
real-life stories illustrating some of the major points were inserted
after the relevant material. Second, examples of what was meant by
certain items and rules were provided throughout the manual. Third, six
figures w[ere ...].

[...] postquestions provided delayed review; that is, all the
postquestions were printed together at the end of the manual.²

Subjects 


The subjects were 40 electrical engineers with different amounts of job
experience. Engineers with different amounts of experience [were used
because] the manual was intended [for use by] new employees as an
orientation to the job and by all employees as a reference source. In
addition, we wanted to determine whether [experience would] moderate
the i[mpact of the techniques].

The amount of material learned was [measured by] a recall test
containing six questions. [These questions] were developed by a panel
of six [super]visors who felt these w[ere ...] should be learned from
[the manual]. The effectiveness of [the manual as] a reference [source
was measured b]y scores on a four-question [look-up test and] by the
time it took the [subject to ...]. [The questions were chosen] as
representa[tive of the] information an engineer would need to look up
during the course of a job. Each version of the manual was rated in
terms of its format, organization, clarity, use for learning a new job,
and use for reference. Each characteristic was rated on a 6-point scale.

² The experimental and control versions of the manual can be found in
Barrett (1972).


Procedure 


Subjects were randomly assigned to experimental and control conditions
with the [restriction that ...] subjects at each level of experience be
assigned to each condition. Subjects were then randomly assigned to
testing sessions. The testing was conducted with 10 subjects in a
session. A classroom containing individual desks for each subject and
for the experimenter was used. Before each session, each subject's name
and code number (indicating his level of experience, the version of the
manual he would read, and his personal identification number) was
written on a card and placed on a desk.

When the subjects were seated at their assigned desks, the experimenter
read the following instructions:

    You [are taking part in] a project to improve the company's
    technical manuals. We've developed [...] which we're trying [out.
    We] appreciate your help in this study.

    I'm going to give each of you a manual to read. Please read it as if
    you had to learn the job of electrical engineer from it. When you
    feel confident that you've learned the material, please raise your
    hand and I'll collect the manual from you.

    After reading the manual, you'll be given a break [in the] lounge on
    this floor. [...] use the facilities in the lounge [...] don't
    discuss with each other [what you've j]ust read. After your break,
    you'll be given two tests. The first test asks you to remember some
    of the major points in the manual. Please raise your hand when
    you've finished. The second test asks you to look up some
    information in the manual. For this test I'll give the test to you
    one question at a time. When you've finished a question, bring it to
    me and I'll give you the next one. After the last question, you'll
    be given an attitude survey to complete.

    [...] a card with a number on [it]. This is your code number. It
    [...] you'll be given and is used to pre[...]. Please take your
    number with [you] on your break.

    [...] that your neighbors take [...] you do during this [...] from
    others. We're trying out two different versions of the manual, so
    please don't be concerned with what anyone else is doing, but work
    at a speed which is comfortable for you. We're not testing you [and
    it] makes no difference how fast or slow you [are. I'll h]and out
    the manuals now. Please don't [begin until] I say "begin."

[When e]very subject had received a manual, the experimenter told them
to begin reading. When a subject raised his hand, his manual was
collected, the time was noted,
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TABLE 1 


MEANS AND STANDARD DEVIATIONS OF RECALL TEST SCORES, LOOK-UP TEST
SCORES, AND LOOK-UP TEST TIMES

                      Recall test     Look-up test    Look-up test
                      score (a)       score (b)       time (c)
Condition             Mean    SD      Mean    SD      Mean    SD

Experimental
  Most experienced    17.9    4.4     13.5    0.8     24.4    7.0
  Least experienced   19.2    3.4     13.3    1.0     26.6    9.9
  Total               18.5    4.0     13.4    0.9     25.5    8.6
Control
  Most experienced    [?]     2.6     12.4    1.4     33.4    4.3
  Least experienced   [?]     2.9     11.2    1.2     28.1    6.4
  Total               [?]     2.8     11.8    1.4     30.8    6.1

(a) [...] score was worth 25 points. (Each of 6 questions w[...])
(b) [...] score was worth 14 points. (Each of 4 questions w[...])
(c) Time was recorded in minutes.
[?] Entry illegible in this copy.


and he was escorted by an assistant to a lounge containing card tables,
cards, books, magazines, and a television set. No other employees were
allowed in the lounge during the testing sessions. A second assistant
stationed in the lounge prevented any discussion of the manual's
contents. The subjects were given this break at the request of their
supervisors, [who] felt that the absence of a break would create
negative attitudes.

After his break, each subject was given the recall test. When he raised
his hand, the test was collected. The subject was then given his manual
and the first question on the look-up test. The time was recorded and
the subject was told to return to the experimenter's desk for the next
question when he had finished. The process of getting a question,
answering it, and getting the next one was repeated until all four
questions were completed. The start and finish times were recorded for
each question. When the subject completed the fourth question, he was
given the attitude questionnaire. When he finished it, he was thanked
and dismissed.


RESULTS

The findings provide considerable support for the hypothesis that the
inclusion of advance organizers, postquestions, concrete illustrations,
and delayed review in a technical manual facilitates learning.
Comparison of the recall test scores of experimental and control groups
indicated that the difference was significant at the .01 level (see
Tables 1 and 2). Moreover, .41 of the variance was accounted for,
suggesting that the strength of the relationship between the character
of the manual and recall test scores is considerable.

The findings also provide substantial support for the hypothesis that
the presence of advance organizers, postquestions, concrete
illustrations, and delayed review enhances the manual's effectiveness
as a reference source. Analysis of the look-up test scores (see Tables
1 and 2) indicated that subjects who used the experimental version of
the manual found more of the information required, on the average, than
subjects who used the control version (p < .01). The variance accounted


TABLE 2

ANALYSIS OF VARIANCE OF RECALL TEST SCORES, LOOK-UP TEST SCORES, AND
LOOK-UP TEST TIMES

Source              df      MS       F         Variance accounted for

Recall test scores
  Version (V)        1     342.2    26.47**    .41
  Experience (E)     1      13.2     1.02
  V x E              1       0.1     0.01
  Error             36      12.9

Look-up test scores
  Version (V)        1      25.6    18.16**    .32
  Experience (E)     1       4.9     3.48
  V x E              1       2.5     [?]
  Error             36       1.4

Look-up test times
  Version (V)        1     275.6     4.83*     .09
  Experience (E)     1      24.0     0.42
  V x E              1     140.6     2.46
  Error             35      57.1

* p < .05.
** p < .01.
[?] Entry illegible in this copy.
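
The F ratios in Table 2 follow from the tabled mean squares: each effect's F is its mean square divided by the error mean square for the same measure. A minimal sketch of that arithmetic, using the look-up test time rows as printed; the function name `f_ratio` and the dictionary layout are illustrative assumptions, not part of the original analysis.

```python
def f_ratio(ms_effect: float, ms_error: float) -> float:
    """Return F = MS_effect / MS_error for one term of a fixed-effects ANOVA."""
    if ms_error <= 0:
        raise ValueError("error mean square must be positive")
    return ms_effect / ms_error


# Mean squares for look-up test times, copied from Table 2.
MS_ERROR_TIME = 57.1
time_effects = {"Version": 275.6, "Experience": 24.0, "V x E": 140.6}

# Recompute each F ratio to two decimals, the precision used in the table.
time_f = {
    name: round(f_ratio(ms, MS_ERROR_TIME), 2)
    for name, ms in time_effects.items()
}
print(time_f)
```

Because the printed mean squares are themselves rounded, recomputed values for the other two measures can differ from the tabled F ratios in the second decimal (for example, 342.2 / 12.9 for recall test scores), so small discrepancies are to be expected.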


360          NEIL C. KALT AND KATHERINE MERLO BARRETT


TABLE 3

MEANS AND STANDARD DEVIATIONS OF THE RATINGS OF EACH CHARACTERISTIC

Condition | Format | Organization | Clarity | Use for learning a new job | Use for reference
Experimental | 
Most experienced | Me 58 
1j Be 5.0 49 ! 5.6 5.8 1 
| te z | 7 | 0.4 0.4 
SD | 0.5 0.5 | 0. | 
t experienced EA 
= experien | we 5. 60 | 6.0 e 
SD | 0.5 0.5 | 0.0 | 0.0 : 
"Total | ý 
X 54 53 | 3 59 5.8 
SD 0.6 0.6 | 0.5 | 0.3 0.5 
Control | 
Most experienced | - 
Dg 4.7 4.6 | 5.1 | 5.1 5.2 
SD 0.5 0.7 | 0.8 | 0.9 0.8 
Least experienced | | ; 
A 3 3.9 41 | 3.2 3.8 
SD 0.7 0.7 | 0.7 | 1.0 1.0 
Total | a 
e ae 4.3 4.6 42 w 
D .8 1 | ? e 
S. | 0.8 | 0.9 14 C" 
" " i 3 a kz Age E E RN = «at On 
Note. A 6-point rating scale was used and scored with the lowest rating given a score of 1 and the highest rating a score of 6.
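for by the version of the manual was .32. Subjects in the experimental
group also took less time than subjects in the control group to find
[the information] (p < .05; proportion of variance = .09). [The
experimental] group also spent [...] the control [group] (experienced
X = [...]34, inexperienced X = [...]; [...] 2.7; F = 5.12,
df = 1/36 [...]).

On the meas[ures of attitude toward the] manual, the e[xperimental
group rated the] manual's format, organization, [clarity, use] for
learning a new job, [and use for reference] more favorably than t[he
control group] (p < .01). Moreover, the proportion o[f variance]
accounted for in the rat[ings of each character]istic ([...] ranged
from .3[...]) [suggests that] the strength of the rel[ationship
between] version of the manual [and attitudes toward] that manual is
considerable (mean [...]; see Table 3). The interaction [between
version of the] manual and the leve[l of experience on r]atings of each
charac[teristic was signi]ficant (p < .01, [...]), suggesting that the
impact [of the techni]ques was moderated by j[ob experience ...] both
versions [...] tended to rate the experimental version somewhat more
favorably th[an] the control.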


DISCUSSION

The results of this study provide substantial support for the
hypotheses that the presence of advance organizers, postquestions,
concrete illustrations, and delayed review in a technical manual
facilitates learning and enhances the manual's effectiveness as a
reference source. One explanation of these findings is that each of the
techniques facilitated learning. Advance organizers identified the
important information in the manual, thereby solving the subject's
problem of what to attend to. Postquestions and delayed review led the
subject to go over that information, increasing the likelihood that it
would be remembered. And concrete illustrations made learning easier by
relating new terms and concepts to terms and concepts already in the
subject's repertoire. The identification of important information, the
provision of review, and the tying of new terms and concepts to
concrete illustrations also made it easier to find and understand
needed information, thereby increasing the effectiveness of the manual
as a reference source.

While this explanation is appealing, the failure of the present study
to isolate the effects of each technique makes it less than conclusive.
For example, it's also possible that one, two, or three of the
techniques were responsible for most of the variance accounted for in
recall test scores, look-up test scores and times, and ratings.
Accordingly, research designed to determine the separate and
interactive effects of the several techniques needs to be carried out
before any conclusions can be reached.

The impact of the facilitators on the attitudes of engineers with
little experience and on their look-up test scores and times suggests
that such engineers will not make effective use of a technical manual
as a reference source unless some, if not all, of these techniques are
included in that manual. Experienced engineers, on the other hand, seem
likely to use a manual effectively whether or not facilitators are
present. It would appear, then, that engineers with much experience are
familiar enough with the terminology and procedures of engineering and
the organization to feel comfortable with either version of the manual,
whereas engineers with little experience have yet to acquire this kind
of familiarity.

While the results of this study suggest that the judicious use of
facilitators can increase considerably the effectiveness of a technical
manual, much research is needed in order to make clear the conditions
under which this relationship does and does not hold. It seems that
research in this area ought to look at


the impact of facilitators in different techni- 
cal manuals, examine the effects of different 
operational definitions of each facilitator, use 
job trainees rather than job occupants (since 
job experience may attenuate the difference in 
learning between experimental and control 
groups), obtain measures of job performance 
as well as recall test scores and look-up test 
scores, and, again, determine the separate and 
interactive effects of the several facilitators. 
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There is some evidence that an observer's race 
affects the behavior of the person being observed
(see Ledvinka, 1972; Sattler, 1970). This evidence
has implications for pers
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hite institution, then 
job seeker should 
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job-leaving reasons are often viewed b: 


given by black job seekers to 
ers at a state employment service. 
involved a rejection of the job 
by the employer than did white 
y state employment 
T's credentials, it is possible that 
racial artifact of the interviewing 


rejection of the worker by the employer. Similarly, all reasons for
quitting that had to do with the person's life off the job and all
reasons for dismissal that had to do with something unrelated to the
worker or with anything he did were categorized together as involving
no rejection. It was predicted that black employment service
interviewers would elicit more rejection reasons and fewer nonrejection
reasons from black job seekers than would white interviewers.


METHOD³

Black job seekers in special poverty-program offices of the state
employment service in a [...] midwestern city were assigned in 1968 [to
one] of two white and two black employment interviewers. The assignment
of job seekers to interviewers approximated random sampling. The
interviewers, all females, were the regular registration interviewers
employed by the [service]. Their selection was deter[mined] by [the]
field setting and by [their willingness to particip]ate in the research.

[The in]terview covered a varie[ty of ...]. [The interviewer] would ask
the job seeker [about his work] experience and his reasons for leaving
previous jobs.

[Interviews w]ere recorded by means of a small battery-operated
cassette recorder placed [on] the interviewer's desk. Interviewers were
as[ked to] conduct their interviews as they normally would and to offer
job seekers the opportunity to refuse [to have their] interviews
recorded. The research [hypothesis was not expl]ained to the
interviewers, although it was explained to them that reasons for job
leaving would be examined.

The procedure yielded a total of 68 tape-recorded registration
interviews. Five of these were lost before they could be analyzed for
this research, and no reason for job leaving was mentioned in 23 of
them, either because the job seeker had never been employed or because
the interviewer did not ask for a reason during the first half hour
(half-hour cassettes were used). Of the 40 job seekers left, 25 were
males and 15 were females, with a mean age of 31.6 years and a mean
educational level of 10.2 years. Of the [...] whose marital status was
indicated, 12 were single, [...] married, and 11 divorced or separated.
Sixteen of the 40 were seen by black interviewers and 24 by white
interviewers. None was employed full time at the time of the interview.
[Chi-square] analysis indicated no differences in [...] between the 40
subjects used in the research and the 23 who gave no job-leaving reason
(p > .25). Similarly, unweighted-means factorial analyses of variance
indicated no differences in age and education between used and unused
subjects and no differential selection of subjects by black and white
interviewers (interaction between the used and unused subject factor
and the interviewer-race factor) with regard to age and education of
subjects (p > .25 in all cases).

³ The method is described more fully in an earlier report (Ledvinka,
1971).

Thirty-seven of these interviews were used in earlier research on
interviewer-race effects. The remaining 31 interviews could not be used
in the earlier research for reasons unrelated to the present research.

In the earlier research, interviews were also conducted in the
youth-program offices of the state employment service. Those interviews
were not used in the present research because the interviewers there
often did not ask the job seeker why he left earlier jobs.


TABLE 1

REJECTION (R) AND NONREJECTION (NR) REASONS GIVEN BY BLACK JOB SEEKERS

                           Job seekers
                  Total         Male          Female
Interviewers     R    NR      R    NR       R    NR

Black           [?]  [?]     [?]  [?]      [?]  [?]
White            4    20      3    12       1     8

[?] Entry illegible in this copy.


The researcher categorized job-leaving reasons as rejection or
nonrejection. Although the researcher had examined the interviews two
years previously for the purpose of testing other hypotheses, the two
categories (rejection and nonrejection) were defined on the basis of
interviews not used in the present research. To determine the
reliability of the category definitions, all reasons for job leaving
were presented to five colleagues and graduate students who served as
judges. None of the five was aware of the research question. Judges
were given the definitions of "rejection" and "nonrejection" noted
above, with no scoring examples and no mention of race. Judges were
also asked not to "read in" the motivations underlying any of the
reasons. Reliability (Spearman-Brown r) was .98 when corrected for
frame of reference (Winer, 1962) and .91 when uncorrected.
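
The Spearman-Brown coefficient reported above estimates the reliability of the pooled judgment of k judges from the average inter-judge correlation r, as k·r / (1 + (k − 1)·r). The sketch below only illustrates that formula; the function name and the input value of .50 are hypothetical and are not figures from the study, which does not report the average inter-judge correlation.

```python
def spearman_brown(r_single: float, k: int) -> float:
    """Stepped-up reliability of k pooled judges.

    r_single is the average correlation between pairs of single judges;
    k is the number of judges whose ratings are pooled.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return k * r_single / (1 + (k - 1) * r_single)


# Hypothetical example: five judges whose ratings correlate .50 on average.
pooled_reliability = spearman_brown(0.50, 5)
print(round(pooled_reliability, 2))
```

The formula shows why pooling several judges can yield coefficients in the .90s even when any two judges agree only moderately.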


RESULTS 


Since 16 job seekers stated only one job- 
leaving reason, the first reason offered by each 
job seeker was selected for analysis, in order that 
all job seekers would contribute equally to the 
analysis. Table 1 presents the distribution of 
those reasons, classified by race of interviewer 
and sex of job seeker. Black job seekers did offer 
more rejection reasons and fewer nonrejection 
reasons to black interviewers than they did to 
white interviewers (p = .002, one-tailed test, computed
by the Fisher exact probability formula;
for males, p = .005 and for females, p = .143). 
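
The Fisher exact probabilities cited above can be computed directly from a 2 x 2 table with fixed margins. The sketch below is an illustration under stated assumptions, not the author's computation: the white-interviewer male counts (3 R, 12 NR) are taken from Table 1, while the black-interviewer male counts (8 R, 2 NR) are an assumption, since that row is illegible in this copy; they were chosen only to fit the sample sizes given in the text (25 males, 15 of them seen by white interviewers). The function name is hypothetical.

```python
from math import comb


def fisher_one_tailed(a: int, b: int, c: int, d: int) -> float:
    """One-tailed Fisher exact p for the 2x2 table [[a, b], [c, d]].

    With all margins fixed, returns the hypergeometric probability of a
    top-left count at least as large as the observed a.
    """
    row1 = a + b          # size of the first row (e.g., black interviewers)
    col1 = a + c          # size of the first column (e.g., rejection reasons)
    n = a + b + c + d     # total number of cases
    upper = min(row1, col1)
    tail = sum(
        comb(col1, x) * comb(n - col1, row1 - x) for x in range(a, upper + 1)
    )
    return tail / comb(n, row1)


# Rows: black, white interviewers; columns: rejection, nonrejection reasons.
# The first row (8, 2) is ASSUMED; the second row (3, 12) is from Table 1.
p_male = fisher_one_tailed(8, 2, 3, 12)
print(round(p_male, 3))
```

Under these assumed counts the result rounds to the value reported for males; that agreement is consistent with the assumption but does not confirm the illegible entries.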

Of the 24 job seekers who did state two or 
more job-leaving reasons (for either the same 
job or different jobs), 12 offered both intrinsic 
and extrinsic reasons. Reasons after the first one 
given by the job seeker tended to increase the 
p values above; thus, for the job seeker's modal 
reason (the category he chose the most times), 
excluding ties, p = .009 with n = 34 (for males,
p = .013 with n = 22 and for females, p = .363
with n = 12) and for the job seeker's last reason
given, p = .067 for all 40 subjects (for males,
p = .029 and for females, p = .664).


Discussion 


The sex differences in p values, along with the 
fact that only female interviewers were used, 
raise the possibility that sex effects as well as 
race effects operated in the interviews. Thus, further research might
vary race and sex of both interviewer and job seeker. Moreover, the
findings on reasons after the first one given suggest that the effects
of the interviewer's race decrease as the interview progresses.

In presenting category definitions to the judges, rejection was labeled
"intrinsic," and nonrejection was labeled "extrinsic."

The results constit[ute evidence of a racial arti]fact in the
employment interview. [Job-leaving re]asons constitute important
information [for empl]oyment service interviewers. Employers'
insistence on "stability" in the low-[...] [re]ferred by employment
[...] [is w]ell known. Consequently, job-leaving reasons are looked
upon as part of the job seeker's credentials or qualifications. Thus,
the racial artifacts found here could [have] a considerable bearing on
the low-income black['s] prospects in the world of work.
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JOB INTERVIEW TRAINING WITH REHABILITATION CLIENTS: 
A COMPARISON OF VIDEOTAPE AND
ROLE-PLAYING PROCEDURES¹

University of New Mexico


Nineteen clients at a rehabilitation center were randomly assigned to
one of three treatments designed to improve job interview behavior.
Judges' ratings indicated that subjects in a videotape-feedback
condition and those in a role-playing condition improved significantly
more than those in an attention-placebo control group but that the two
experimental groups did not differ from each other.


[Although] the use of behavior therapy techniques, such as modeling and
role playing, has been advocated by substantial numbers of
practitioners and theorists (Bandura, 1969; Yates, 1970), the use and
implementation of these techniques have generally been restricted to
populations of college students or clients referred to psychologists
for therapy. These techniques have rarely been employed with
populations of physically or mentally handicapped clients, such as
those attending a rehabilitation center, even though many of these
clients are obviously deficient in the target social skills that such
programs are designed to train. One such important skill, appropriate
interview behavior, which was investigated in the present study, has
generally been ignored by the rehabilitation counselor in favor of the
counselor assuming the responsibility for actual job placement.

While a variety of different procedures could be used to train job
interview behavior, research regarding the efficacy of various methods
has been [word illegible], with the literature containing no reports of
comparisons among different treatment approaches. Prazak (1969) has,
however, developed an assessment model of five critical areas of
interview behavior for a rehabilitation center population. These five
areas include the following: ability to explain skills, ability to
answer problem questions, appropriate appearance and mannerisms,
enthusiasm, and opening and closing the interview. This was the
assessment model adopted for use in the present investigation.


The present study was designed to investigate 
the effects of two methods—videotape and role 
playing—in implementing a job-interviewing skill 
program for rehabilitation clients. A videotape 
procedure was selected because research suggests 
its efficacy in learning and behavioral change 
(Eisenberg & Delaney, 1970; Wilmer, 1967). 
Role playing was selected because this method 
has been employed as a modeling procedure to 
implement behavioral change (Miller & Dollard, 
1941) and because it is economically more feasi- 
ble than videotape equipment. An attention- 
placebo group served as a control measure. 
Judges’ ratings of videotaped mock employment 
interviews served as the dependent measures. It 
was hypothesized that subjects in the two treat- 
ment groups would evidence significantly im- 
proved interview behavior following treatment 
and that the control group would evidence no 
change in interview behavior. 


METHOD 


Subjects 

The subjects were rehabilitation clients referred 
by a state division of vocational rehabilitation 
and a nonprofit rehabilitation center located in 
the Southwest. Subjects were selected by their 
suspected lack of interviewing skills, previous 
unsuccessful attempts to obtain employment, 
and/or readiness to begin actively seeking em- 
ployment. Although 30 subjects were initially 
selected to participate, only 19 subjects (8 males 
and 11 females) ranging in age from 15 to 55 
years old completed the program. Their disabili- 
ties included physical, emotional, and/or mental 
handicaps. Previous job interview experiences 
ranged from none to many. 


Experimental Conditions 


Subjects were randomly assigned to one of 
three groups which met for two consecutive days 
for a total of 10 hours. The format for all groups 
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consisted of each subject completing an applica- 
tion for employment form; being pretested with 
a five- to ten-minute videotaped mock employ- 
ment interview; and being Posttested with an- 
other videotaped mock employment interview at 
the conclusion of the treatment phase, 
Following the pretesting, the six s[ubjects in the videotape group
...] was assisted in developing a [...] approach, in exploring his own
potential, and in formulating a positive approach to the exploration of
his own problems. Subsequent to this, the videotaped-pretest interviews
were shown to the subjects as a feedback mechanism. The subjects were
urged to comment not only upon their own performance but also upon the
performance of other group members. After completion of this phase, a
brief review was presented and the final interview was taped.

The five subjects in th[e role-playing group ...] excerpts relevant to
the [...] groups; howev[er, ...] critical aspects [...] the initial
taped [...].

[... att]ention effect, [...] tours was devised [...] participation
time comparable to that of the two treatment groups.


SHort NoTES 


Ratings 


Ratings by trained, impartial judges of the pre- and
posttraining-videotaped employment interviews served as the criterion
measures. At the time of rating each videotape, no information was
imparted to the judges regarding either the treatment rendered or
whether it was a pretest or posttest interview. An objective behavioral
rating scale that was constructed to measure 26 specific critical
interview behaviors using a 5-point [quanti]fication range was employed
by the judges in their assessments of treatment effects. [Each judge]
rated each subject for a grand total of 95 individual ratings to be
analyzed. The rated behaviors were those that were deemed mandatory or
highly desirable for interviewing success, [e.g.,] grooming, courtesy,
keeping answers short and positive, etc. In addition, each judge rated
the overall behavioral change between the two interviews on a 7-point
scale that ranged from significantly improved to much worse.


RESULTS

The reliability of the five judges' ratings was measured by calculating
coefficients of concordance: [...] .83 for the posttest total rating
score, and .85 for the subjective rating of overall change or
improvement. The reliability of these ratings was considered
sufficiently high to warrant further analysis.
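
A coefficient of concordance of the kind reported above (Kendall's W) summarizes how closely m judges agree in ordering n subjects, from 0 (no agreement) to 1 (identical orderings). The sketch below assumes each judge supplies a complete ranking with no ties; the study's judges gave ratings, which would first have to be converted to ranks and, where tied, corrected for ties, a step not shown here. The function name and the example ranks are hypothetical.

```python
def kendalls_w(ranks: list[list[float]]) -> float:
    """Kendall's coefficient of concordance for untied ranks.

    ranks[j][i] is the rank that judge j gives to subject i, with each
    judge using the ranks 1..n exactly once.
    """
    m = len(ranks)                # number of judges
    n = len(ranks[0])             # number of subjects ranked
    totals = [sum(judge[i] for judge in ranks) for i in range(n)]
    mean_total = m * (n + 1) / 2  # expected rank sum under no agreement
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical ranks: three judges ordering four subjects identically.
perfect_agreement = kendalls_w([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
print(perfect_agreement)
```

Values such as .83 and .85 therefore indicate that the five judges ordered the subjects in largely, though not perfectly, the same way.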
The composite score on the 26-item [...] interview rating scale was
analyzed by an analysis of covariance, using pretest scores as the
covariate. The analysis revealed a significant difference among the
groups (F = 47.7, df = 2/[...], p < .01); a priori comparisons using
Scheffé's method revealed that the adjusted mean scores of the
videotape (X = 91.74) and role-playing (X = 92.60) groups were
significantly [higher than that of] the control group (X = 61.32;
F = 95.5, df = 1/1[...], p < .01) but that the two treat[ment groups
did not d]iffer significantly from each other.

An analysis of variance of the ratings [of overall] improvement
similarly revealed a significant treatment effect (F = 74.3,
df = 2/16, p < .01); a priori comparisons using Scheffé's procedure
revealed that the mean scores of the videotape (X = 6.13) and
role-playing groups (X = [...]) [were] significantly higher than those
[of the control group] (X = 3.9; F = 146.9, df = 1/16, [...]) but that
they did not differ from [each other].


A second direct measure of improvement was the actual gain scores on
the composite behavioral rating; t tests revealed that subjects in the
videotape group (t = 7.9, df = 5, p < .01) and the role-playing group
(t = 9.90, df = 4, p < .01) both evidenced significant improvement in
their interview behaviors, whereas those in the control group did not
(t = .21, df = 7, p > .10).


Discussion 


The results of this study suggest that both the use of videotape interviews for modeling and feedback and the use of role playing as a job interview training procedure can produce significant improvement in interview skills as compared with previous performance and with an attention-placebo control treatment. While videotape is being used increasingly today for teaching various interpersonal skills, such as interviewing, the savings in equipment expense that would result from employing the equally effective procedure of role playing might suggest that role playing be the treatment of choice. It is possible, however, that the direct feedback produced by viewing one's own videotaped performance may be superior to the indirect evaluation obtained from the reaction of others in ways that were not assessed in this study. It was observed, for instance, that several subjects in the role-playing group initially resisted the possibility of their having interview problems, whereas those in the videotape group were quick to detect their own inappropriate behaviors and acknowledge the need for improvement. A larger number of subjects and a longer follow-up period would be necessary to ascertain whether these two treatments are appropriate for a wider population, whether they produce enduring effects, and whether they differ from each other in the changes they produce.
A DISCRIMINANT ANALYSIS OF ORGANIZATIONAL
PERFORMANCE VARIABLES


HERBERT H. HAND 1 AND WILLIAM R. LAFOLLETTE 2


Graduate School of Business, 


Ninety-eight groups Participating in a gamin; 


senting a company) were divide 
investment, A discri 0 
ing hypothesis: (1) Return on 


The purpose of this research is to quantitatively evaluate performance criteria when the firm is the unit of study. Specifically, discriminant analysis is used in an at


"return on investment? 
criminating levels of perífo 
performance variables 
directly or indi 


riminator Variables will divide 
pings as return on 
Investment would 


- and low 


as utilized 
enrolled in 
Drise course 
he perfor- 


t 16-week simulati 


; E ness enter 
ita, Major mid e 
El 


Western uni 

* Requests. for reprin 
iot displayed should 
vao is now at 
ion, University 
'arolina 29208, 

? Now at Ball State University. 


ts or for data referred to but 
be sent erbert 
the College of Usiness Administra. 
of South Carolina, Columbia, South 


Indiana University 


ypothesis 1 wil] remain constant over time. 


ntified as being 
formance were not shown to be 


mance variable: 
those variables 
mance in the 
highly dependen: 


s ance 
sured by a Composite of all the performa 
criteria, i performance 


was International Operations ane 
interactive business gang 
; & Howells, 1964). The a 
focus of the course, and A 
Criterion of came 
on of the performan 

» the continual feedback on absolute er 
a relatively long tim 


2 ith an 
able isomorphism with ¢ 
Tating Organization, 


kly 
© performance variables calculated a dni 
d as potential discrim 
the following: 


Percentage of:

cash to sales,
supplier credit to sales,
inventory costs to sales,
net earnings to sales,
product stockouts to the opportunities to stock out.

Percentage deviation:

from optimum for sales distribution expenses.

Percentage deviation of:

actual from forecasted sales,
actual from forecasted unit variable costs,
actual from forecasted net earnings,
actual from forecasted operating area cash balances,
actual from forecasted home office cash balance, and
actual from forecasted return on investment.


The data were collected for Quarters 5 through
in the simulation. For purposes of this research, only data from Quarters 6 through 11 were utilized in an attempt to eliminate possible "start up" and "shut down" biases.

A discriminant analysis for several groups was utilized for data analysis. The sample of 98 groups was divided into quartiles by actual return on investment. The discriminant analysis attempted to predict group membership from the 12 discriminant variables based on the preceding quartile classification of return on investment, the dependent measure. The quartiles were identified as the high-high, high-low, low-high, and low-low groups and were composed of 25, 25, 24, and 24 groups, respectively. The variables in each observation period were statistically normalized in order (a) to permit comparability of the data, (b) to reduce scaling errors due to the presence of different parameters in the various sections of the simulation course, and (c) to simplify the analysis of a discriminant function.
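The two preparation steps just described can be sketched as follows. This is only an illustration under stated assumptions: the return-on-investment figures are hypothetical, the function names are invented for the example, and the z-score is assumed as the normalization because the note does not say which formula was used.

```python
from statistics import mean, pstdev


def quartile_labels(roi, sizes=(25, 25, 24, 24),
                    names=("high-high", "high-low", "low-high", "low-low")):
    """Rank groups by return on investment (highest first) and cut the
    ranking into consecutive named quartiles of the given sizes."""
    if sum(sizes) != len(roi):
        raise ValueError("quartile sizes must add up to the number of groups")
    order = sorted(range(len(roi)), key=lambda i: roi[i], reverse=True)
    labels = [None] * len(roi)
    start = 0
    for size, name in zip(sizes, names):
        for i in order[start:start + size]:
            labels[i] = name
        start += size
    return labels


def standardize(values):
    """Rescale one variable to zero mean and unit variance (z-scores)."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - m) / s for v in values]


# Hypothetical ROI values for 98 groups (98 distinct numbers, not study data).
roi = [0.01 * ((37 * k) % 98) for k in range(98)]
labels = quartile_labels(roi)
z = standardize(roi)
print({name: labels.count(name) for name in sorted(set(labels))})
```

Standardizing each variable within a period puts ratios and forecast errors on a common scale, which is one way to meet the comparability aim listed under (a).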


RESULTS 


The first set of results are related to working Hypothesis 1, that is, the 12 discriminator variables will divide the sample into the same groupings as the dependent variable, return on investment. Data from each of six separate time periods were analyzed and arranged in the form of a classification matrix for each of the periods. The data for Quarter 9 are presented as an example of the format. Each matrix summarizes the results of the estimation of the four probability density functions from the four appropriate

TABLE 1

QUARTER 9 CLASSIFICATION MATRIX UTILIZING 12 DISCRIMINANT PREDICTORS

Return on                  Discriminant function a
investment                                                  % group
grouping             1       2       3       4     Total     hits

High-high           24       1       0       0       25       96
High-low             6      14       5       0       25       56
Low-high             0       3      19       2       24       79
Low-low              0       1       4      19       24       79
Total               30      19      28      21       98
% function hits     80      73      68      90

a With df = 36, Mahalanobis D² = 322.023 can be used as a chi-square. The hypothesis that the mean values are equal in all four groups is rejected with p < .001.
TABLE 2

COEFFICIENTS FOR NET EARNINGS/SALES CROSS-CLASSIFIED BY FUNCTION AND QUARTER

Quarter    Function

1.02
3.19

discriminant functions. The discriminant func- 
tions assign each of the 98 teams to the group 
for which the probability density is greatest. 
Each team’s response vector is compared to the 
mean response vector for each group. The team 
is then assigned to the group where the difference 
is smallest. If each of the four group functions 
were perfect discriminators, each numeric entry 
in the principal diagonal of the matrix would 
equal the row total with all other entries being 
equal to zero (Wood, 1971). The results of the 


analysis indicate that in each of the six time periods the null hypothesis was rejected (p < .001).
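The reading of a classification matrix described above can be made concrete with a short sketch. The counts below are illustrative: they were chosen to agree with the row totals, column totals, and hit percentages reported for Quarter 9, but the individual cells should be treated as an assumption rather than as the authors' data. The names `MATRIX` and `hit_rates` are hypothetical.

```python
# Rows: actual return-on-investment quartile.
# Columns: quartile assigned by the discriminant functions.
# Illustrative counts consistent with the Quarter 9 totals (98 teams).
GROUPS = ["high-high", "high-low", "low-high", "low-low"]
MATRIX = [
    [24, 1, 0, 0],
    [6, 14, 5, 0],
    [0, 3, 19, 2],
    [0, 1, 4, 19],
]


def hit_rates(matrix):
    """Return (% group hits per row, % function hits per column, overall %).

    A perfect discriminator would put every team on the principal diagonal,
    so each rate is the diagonal entry divided by a row or column total.
    """
    size = len(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    diagonal = [matrix[i][i] for i in range(size)]
    group_hits = [100 * d / t for d, t in zip(diagonal, row_totals)]
    function_hits = [100 * d / t for d, t in zip(diagonal, col_totals)]
    overall = 100 * sum(diagonal) / sum(row_totals)
    return group_hits, function_hits, overall


group_hits, function_hits, overall = hit_rates(MATRIX)
for name, g, f in zip(GROUPS, group_hits, function_hits):
    print(f"{name:10s} group hits {g:5.1f}%   function hits {f:5.1f}%")
print(f"overall correct placements: {overall:.1f}%")
```

Group hits answer "of the teams truly in this quartile, how many were placed there," while function hits answer "of the teams a function claimed, how many belonged to it."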


The classification matrices for Quarters 6 
through 11 indicated a somewhat consistent per- 
centage of group hits. The percentage of accu- 
rate placements ranged from 76% to 96% for the 
high-high grouping; from 44% to 76% for the 
high-low group; from 54% to 79% for the low- 
high group; and from 71% to 79% for the low- 
low group. It would appear that the high-high- 
and the low-low-group-placement predictions are 
not only most accurate, but also most stable. The 
pattern of function hits closely parallels that of 
the group hits. The second set of results are 
related to working Hypothesis 2, that is, the co- 
efficients associated with the various discriminator 
variables will remain relatively constant over time 
for high- and low-performing teams. In analyz- 
ing these data, tables were constructed by select- 
ing the coefficient relating to each of the 12 dis- 
criminator variables for the appropriate function 
and quarter. The net earnings/sales coefficients 
in Table 2 are an example of these compilations. 

Function 1 represents the high-high group and 
Function 4 represents the low-low group. An 
attempt to synthesize the 12 tables is presented 
in Table 3. Each cell in the matrix has multiple 
entries of +s and/or —s. The entry refers to a 
coefficient > .30. The actual + or — refers to 
its directionality. The sequencing of the cell entry 
corresponds to the timing of its entry. Table 3 
reports generalized findings with respect to the 
TABLE 3
DISCRIMINANT FUNCTION CORRELATIONS > ±.30
Item ‘ 2: á 

<u 
Cash/sales MA ym —,— 
Supplier credit/sales = —,— —,— 
Inventory cost/sales SS 4,- Tu- — 
Net earnings/sales eb t+4,4,— Ar 
Stockouts/opportunities =, —,— = 
Sales expenses/optimum +,+,—,—,— = ism a 
Sales forecast error "c T d 
Unit cost forecast error —,—,— To hex 
Operational earnings forecast 

error teh + =r yh dci 
Arca cash forecast error tod + enc: s 
Home office cash forecast error =H, =, +, +, + FE soot 
Return on investment forecast 

error 


second working hypothe 
indicate that only the net e 
cients indicate a very strong and Consistent rela- 
tionship, that is high net earni i 


, 


to high return on investment 


extremel 
function 
tudinal stabilit 
cash/sales, su 


Y effective discriminator across the : 
© to simultaneously have d 
Y across time, It may be noted E, 
pplier credit/sales, inventory C05 


Discussion 


sales, unit variable cost forecast error, and return on investment forecast error are components of a manager's job which he would attempt to minimize. A test of Hypothesis 2 indicated that the high-high-performing teams do, in fact, tend to minimize these variables over time, whereas the low-low-performing teams did not minimize the identical variables.


CONCLUSIONS


stability of each 
While a rigorou 


Only 1 of the 12 discriminator 
earnings/sales) appeared not on 


re with which Si 
An orga. Wei 


ntified with 


The results of this study indicate that in a simulation gaming environment in which equally weighted multiple-performance variables exist, return on investment seems to conceal a number of interesting and complex phenomena that occur among the other 12 performance indicators. However, when Hypothesis 2 was examined in


Were cast on the b 
Hypothesis 1, Tt is suggeste quate 
research is needed (a) to e prat 
i istencies may be due to "m 

n to unforeseen condi” Gr 
rcumstances in which 1 "ervent 
ance variables may logically in i 


i igate 
between the simpler relationships investi& 
this research, 
Graduate School of Business and Industrial Relations Research Institute


University of Wisconsin-Madison


the inventories. Results showed
istortion of forced-choice personality inven- i 
up was first hired and then requested to take 


no significant mean scale differences between 
groups. Certain scales were signifi y 


mental group but not in the cont; 
significantly moderated by the i 


Although forced-choice personality inventories are designed to reduce response distortion, or faking, voluminous evidence exists suggesting that many forced-choice instruments can be distorted when subjects are instructed to do so (Waters, 1965). Yet the typical study using students, repeated-measure designs, and various forms of instructions to distort says little about whether individuals actually distort their responses when the instruments are being used for decision-making purposes such as employee selection. A more appropriate procedure in these situations would be to e


nd thus enhance 
(Schwab, 1971). 


Only three 1957; Kirchner. 
1962; V, n Gordon, 1963) 
of forced-choice 
employees in a 
Unfortunately, all 


study and H. G, Heneman lI 
comments on an earlier draft oj thi: 
* Requests for reprints should be se 
sent t 
P. Schwab, Graduate School of Business, Gene 
of Wisconsin, 1155 Observatory Drive, Madi 4 
Wisconsin, 53796, : il 
Publishing Co., 


® Now at Western P = 
New York, f oughkeepsie, 


5 article, 
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rol group; however, th 
nstructional set provided. 


is relationship was not 


applicants and employees besides the motivation to distort. A more appropriate design would randomly assign only applicants to distortion and honest conditions (Heron, 1956).

A final issue pertains to the implications of being motivated to distort or actual distortion for the predictive validity of the instrument. Would, for example, an inventory be a less valid indicator of subsequent employee behavior if as applicants these employees distorted their responses? This question has not been answered empirically. And, as Bartlett and O'Leary (1969) demonstrate, no necessary relationship exists between mean values obtained on predictor or criterion variables across different groups and the validity coefficients to be expected. In the present context, for example, different mean values on an inventory in distortion- and honest-response groups would not necessarily mean that the validity of the instrument for predicting a criterion was different for the two groups. Likewise, a lack of mean differences does not necessarily imply that the two groups will yield similar predictor-criterion correlations.

The present study was addressed to two questions. First, it was aimed at properly examining the question of whether job applicants actually distort their responses on forced-choice personality inventories when seeking employment. Second, the study was designed to examine the predictive validity of two forced-choice inventories, the Gordon Personal Inventory (GPI) and the Gordon Personal Profile (GPP), within each treatment condition. Tenure was used as the criterion, because the firm was experiencing a substantial degree of turnover among its labor force.




METHOD 
Sample 


The sample consisted of all persons hired to 
become small-parts assemblers during a three- 
and-one-half-month period by a medium-sized 
electrical manufacturing firm that employs ap- 
proximately 400 people. This procedure resulted 
in a total sample of 59 female subjects. The 
typical subject was married, had two children, 
was in her late 20s or early 30s, and possessed 
a high school education. 


Procedure 


All subjects were treated identically during the initial phases of the hiring process. Specifically, at the time of application each subject (a) filled out an application blank, (b) had a physical examination, and (c) was interviewed by the firm's personnel manager. At the conclusion of these steps, the personnel manager decided whether or not to hire the applicant. Applicants not hired were told so at this time and were dismissed from further consideration for the purposes of this study.

Those applicants that the personnel manager decided to hire were then divided, on an alternating basis, into two groups. The experimental group (n = 29) was told by the personnel manager: “As the final step in our selection process, we request that you fill out these two inventories [the GPI and GPP]. Please read the instructions carefully and answer all the questions.” To give the impression that the inventory results were used in the selection decision, the personnel manager took the completed inventories to another room, waited from 5 to 10 minutes, then returned and offered the subject a job.

Members of the control group (n = 30) were offered a job by the personnel manager immediately following the interview. As they were leaving the personnel office after accepting the job (all subjects in both groups accepted employment when offered), control subjects were asked by a personnel clerk to fill out the GPI and GPP as follows:

We are cooperating with some researchers at the University of Wisconsin on a project to study several inventories designed to find out how people think about themselves. Your responses will be mailed directly to the University of Wisconsin and have nothing to do with your employment here at . . . [name of firm].


All subjects agreed to fill out the inventories. They were told, “please read the instructions carefully and answer all the questions.” Three subjects in the experimental group failed to fill out the GPI properly and two control subjects failed to fill out the GPP properly. These responses were eliminated from the analysis.

Employment records for all subjects in the study were examined six months after the initial hiring process. During that time, 33 (56%) of the subjects voluntarily left the firm. The dichotomy (voluntary termination or retention) was the criterion employed to test the prediction issue investigated in the present study.


Analysis 


The significance of mean differences between ex- 
perimental and control groups on all GPI and 
GPP scales was examined to determine whether 
the subjects who were led to believe that the 
inventories were used in the selection decision 
differed from those who were not. Correlation 
coefficients between scale scores in each treatment 
condition with turnover were calculated to aid in 
analyzing the second major problem considered. 
The significance of the differences between corre- 
lation coefficients was calculated to determine 
whether the treatment condition served as a 
moderator of the tenure-inventories relationships. 


RESULTS 


The first part of the analysis examined the 
mean GPI and GPP scale scores of each sub- 
group. There were no significant mean differences 
between experimental and control groups on any 
scale of either inventory. Indeed, on only three 
of the eight scales were the differences in the 
direction generally thought to be more socially 
desirable (i.e., higher in the experimental group). 

The second part of the analysis examined the relationships between voluntary turnover and GPI and GPP scale scores for each treatment condition. There was a significant (p < .05) negative relationship between turnover and the vigor scale of the GPI in the experimental group (r = −.47). On the GPP, ascendancy and sociability scores were positively and significantly (p < .05) related to turnover in the experimental group (r = .43 and .38, respectively). However, none of the correlation coefficients were significant in the control group. None of the z tests of differences in zero-order correlation coefficients between the two groups were significant.⁴
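The significance of the three experimental-group correlations can be checked with the usual t test for a correlation coefficient. The sketch below is not the authors' computation. It assumes the correlations were based on 26 usable GPI protocols (29 minus the 3 discarded) and 29 usable GPP protocols, which the note implies but does not state, and it hard-codes two-tailed .05 critical values taken from standard t tables.

```python
import math


def t_for_correlation(r, n):
    """t statistic and degrees of freedom for testing H0: rho = 0."""
    if n < 3 or not -1.0 < r < 1.0:
        raise ValueError("need n >= 3 and -1 < r < 1")
    df = n - 2
    return r * math.sqrt(df / (1.0 - r * r)), df


# Two-tailed .05 critical values of t for the two df used here.
T_CRIT_05 = {24: 2.064, 27: 2.052}

# (scale, r reported in the text, assumed n after discarding bad protocols)
CASES = [
    ("GPI vigor", -0.47, 26),
    ("GPP ascendancy", 0.43, 29),
    ("GPP sociability", 0.38, 29),
]

results = {}
for scale, r, n in CASES:
    t, df = t_for_correlation(r, n)
    results[scale] = (t, df, abs(t) > T_CRIT_05[df])
    print(f"{scale:16s} r = {r:+.2f}  t({df}) = {t:+.2f}  "
          f"significant at .05: {results[scale][2]}")
```

Under these assumed sample sizes all three coefficients exceed the critical value, which agrees with the p < .05 statements in the text.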


Discussion 


⁴ A table reporting mean scale values for each subgroup as well as the correlation between scale scores and turnover for each group is available from the first author.

Previous research by Bass (1957) and Van Buskirk (reported in Gordon, 1963) has shown that job applicants score differently than employees on all scales of the GPP. This has been interpreted to mean that job applicants deliberately distort their responses in an employment context. No such evidence, however, was found in the present study when job applicants were randomly


ol conditions, 


evidence that they were 
actually distorted by applicants when seeking em- 
ployment, 


In a very crucial respect, however, whether 
job applicants distort res 
in mean scores on the inventory 
tant as the relations bety 
criterion behavior of in 


es the predictor-criterion 
erating effect 
control condition 
Were not significant in 
theless, there was a tendenc 
and GPP to Predict turnoy, 
in the experimenta] group. 


the present study shows no evi- 


ere distorted by 
differences with 




ditions. Moreover, the inventories were at least 
as valid predictors in experimental as in control 
conditions. While these results should not be 
generalized beyond the type of applicant studied, 
the inventories used and the criterion predicted, 
they do strongly suggest that the conclusions ob- 
tained from distortion research on students and 
even applicants versus employees apparently tell 
us little about what to expect in employee
selection and predictive validation studies.
CROSS-CULTURAL DIFFERENCES IN TWO-FACTOR
MOTIVATION THEORY


GEORGE H. HINES 1
Victoria University of Wellington, New Zealand


Herzberg’s two-factor motivation theory was tested in New Zealand using 
ratings of 12 job factors and overall job satisfaction obtained from 218 middle 
managers and 196 salaried employees. Contrary to dichotomous motivator- 
hygiene predictions, results revealed supervision and interpersonal relationships 
to be ranked highly by those with high job satisfaction and strong agreement 
between satisfied managers and salaried employees in the relative importance 
of job factors. Findings were interpreted in terms of social and employment 


conditions in New Zealand. 


Among the most internationally accepted theories of work motivation in recent years has been Herzberg's (Herzberg, Mausner, & Snyderman, 1959) motivator-hygiene model, which holds that job satisfaction is a function of challenging work activities (motivator factors), while job dissatisfaction is a function of extrinsic variables like salary, working conditions, and supervision (hygiene factors). Despite the influence of this formulation, numerous studies have failed to provide unequivocal support. Some researchers have found situations in which hygiene factors were associated with job satisfaction (e.g., Ewen, 1964; Wernimont, 1966), while others (e.g., Armstrong, 1971) report that job factor importance is linked with occupational level. Dunnette, Campbell, and Hakel (1967) believe this theory to be grossly oversimplified; they conclude that satisfaction or dissatisfaction can reside in the job content, or job context, or both jointly.

Some studies in New Zealand have expressed reservations about the universal applicability of the Herzberg model. Thus Cant and Woods (1968) identified a “man-management” factor and contended that the human relations aspects of the job were slightly more important than technical elements. Griew and Philipp (1969) reported that job factor ratings of New Zealanders were “very different from those of workers overseas [p. 61].” While Mosley (1969) found supervision to be a critical factor causing dissatisfaction, Watson (1971) related supervision and interpersonal relationships to job satisfaction and Hines (1972) indicated the importance of status as both a motivator and hygiene factor. None of the New Zealand studies offered any detailed reconciliation of the differences between their findings and the Herzberg theory.
i; Requests for reprints should be sent to George H. 
nu Victoria University of Wellington, P.O. Box 
g €llington, New Zealand. 


The overwhelming majority of investigations 
of the theory have been conducted in North 
America and Europe, where the work environ- 
ment differs from New Zealand conditions in 
several important ways. Three factors—full em- 
ployment, relatively small companies, and an 
egalitarian ethos—influence the motivation to 
work in that New Zealanders have high job secu- 
rity, interpersonal relationships tend to be more 
frequent, and the employer-employee contact is 
reputedly more personal, relaxed, and friendly. As 
a consequence, it seems logical to expect that 
the factors which influence job satisfaction in 
New Zealand might differ from those in countries 
with dissimilar work environments. In accord- 
ance with Armstrong (1971), different motiva- 
tional factors should also be more salient for 
managers than for nonsupervisory staff. This 
study was designed to test the Herzberg theory 
in New Zealand through a comparison of job 
factor ratings and overall job satisfaction of satis- 
fied and dissatisfied middle managers and salaried 


employees. 
METHOD 


A questionnaire was developed to measure 12 
job satisfaction factors as well as overall job 
satisfaction. Respondents were asked to rate vari- 
ous aspects of their current job using a 7-point 
graphic rating scale adapted from Halpern 
(1966). The initial sample consisted of 480 
middle managers and 327 salaried employees 
from which groups of 144 satisfied and 74 dis- 
satisfied managers and 114 satisfied and 82 
dissatisfied salaried employees were drawn. 


RESULTS 


Table 1 presents the occupational level comparison of job satisfaction scores. Three findings are readily evident: (a) satisfied managers and salaried employees both rate overall motivator and hygiene factors significantly higher (p < .05) than do their dissatisfied counterparts; (b) overall motivator factors (Xm) are not rated significantly higher than overall hygiene factors (Xh) in any of the four groups; and (c) four job factors (recognition, responsibility, interpersonal relationships, and supervision) are r


Spearman rank-order


correlation coefficient
) test indicated that f


or managers as well as


giene factors fo
personnel. Thus f
vator factors made no
satisfaction than


perceive t
sources of their job satisfaction in a
fashion, regardle


TABLE 1

OCCUPATIONAL LEVEL COMPARISON OF JOB SATISFACTION SCORES FOR SATISFIED AND DISSATISFIED MANAGERS AND SALARIED EMPLOYEES

                                          Managers                            Salaried employees
Job factor                    Satisfied  Dissatisfied  Difference    Satisfied  Dissatisfied  Difference
                              (n = 144)    (n = 74)                  (n = 114)    (n = 82)

Recognition                      6.52        4.35        2.17*          6.35        4.77        1.58
Achievement                      5.84        5.26         .58           5.32        4.98         .34
Responsibility                   6.48        3.52        2.96*          5.88        4.23        1.65*
Work itself                      6.29        5.66         .63           5.97        4.55        1.42
Advancement                      5.44        4.72         .72           4.83        4.59         .24
Growth                           5.80        5.11         .69           4.86        4.09         .77
Motivator (Xm)                   6.06        4.77        1.29*          5.53        4.53        1.00*
Interpersonal relationships      5.92        4.00        1.92*          6.22        3.37        2.85
Supervision                      5.95        3.82        2.13*          6.01        2.95        3.06*
Work conditions                  5.04        5.01         .03           4.23        4.20         .03
Company policy                   4.69        4.12         .57           4.70        4.82        −.12
Status                           5.23        4.98         .25           4.82        4.45         .37
Salary                           5.49        5.20         .29           5.24        4.90         .34
Hygiene (Xh)                     5.39        4.52         .87           5.20        4.11        1.09*
Difference (Xm − Xh)              .67         .25         .42            .33         .42        −.09

* p < .05, two-tailed test.


Discussion


The results of this study Show tha
me other Overseas findings, the He,
appears to have validity across occupational levels. In view of the unique features of New Zealand industry, the shared perception of job factor importance by managers and salaried employees would not be unexpected. The significance of


is Strongly emphasized by the re- 
- This finding is consistent with the Cant 
and Woods ( 1968), Griew and Philipp (1969), 


research and highlights the 


sonal associations, 


ere are a number 
of this Tesearch for New 


thout reservation. From the 
erences in New Zealand which havi 
‘tions stating that the tfaining a 
is Currently inadequate and must 


improved, it seems likely that there has been an overconcentration on Herzberg-defined motivator factors to the exclusion of the equally relevant elements, supervision and interpersonal relationships. The present research would seem to indicate that efforts at job enrichment often advocated by Herzberg should be devoted to recognition and responsibility but with simultaneous stress on the improvement of the climate for good interpersonal relationships and the upgrading of supervisory skills. The study also illustrates the role of cultural influence on the Herzberg theory and the need to take cross-cultural differences into account in transplanting a motivation model internationally.
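The agreement between satisfied managers and satisfied salaried employees reported in the Results can be illustrated with a rank-order correlation over the 12 job factors. The sketch below uses the satisfied-group means as they appear in Table 1. Whether the author computed his coefficient on exactly these values is an assumption, and the helper names are hypothetical.

```python
import math

FACTORS = ["recognition", "achievement", "responsibility", "work itself",
           "advancement", "growth", "interpersonal relationships",
           "supervision", "work conditions", "company policy",
           "status", "salary"]

# Mean ratings of the satisfied groups, read from Table 1.
MANAGERS = [6.52, 5.84, 6.48, 6.29, 5.44, 5.80,
            5.92, 5.95, 5.04, 4.69, 5.23, 5.49]
SALARIED = [6.35, 5.32, 5.88, 5.97, 4.83, 4.86,
            6.22, 6.01, 4.23, 4.70, 4.82, 5.24]


def ranks(values):
    """Rank values with 1 = highest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average
        pos = end + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


rho = spearman_rho(MANAGERS, SALARIED)
print(f"Spearman rho over {len(FACTORS)} job factors: {rho:.3f}")
```

A coefficient near 1 means the two occupational levels order the job factors in nearly the same way, which is the sense in which the abstract speaks of strong agreement in relative importance.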
PRESSURE FOR PRODUCTION, TASK DIFFICULTY, AND THE CORRELATION
BETWEEN JOB SATISFACTION AND JOB PERFORMANCE


ROBERT B. EWEN 1
New York University


The hypothesis sug; 
the correlation betw 


Industrial and organizational psychologists are well aware that satisfied employees do not necessarily perform better on the job than those who are dissatisfied. For example, Vroom (1964) extensively reviewed the literature and found a median correlation between job satisfaction and job performance of only .14, indicating that there is no simple relationship between these two variables.


Many different e 
have been offered. 


very 
employee compl. 


» may make job performance 
more important to the empl H j 
r ployees; or, 
ls fairly difficult, relativi Aes ew 
, 


range and 


1 Requests for reprints should be sent to Robert B. Ewen, 8101 Camino Real, Apt. C-317, Miami, Florida 33143.


] 
when researchers correlate 


The purpose of the present paper is to report briefly on findings obtained in an educational setting which are interpreted as supporting this theory.


METHOD 
Subjects 


Two undergraduate classes in psychology at 
New York University were compared in this 
study. 

One class dealt with psychological statistics; the enrollment in this course consisted of 33 students. All psychology majors at New York University are required to take statistics and earn a grade of C or better, which leads to a great deal of pressure. In fact, many students have experienced difficulty in the past with numerical concepts and procedures, and, therefore, regard the course in statistics as the one major obstacle to graduation.

The second class dealt with abnormal psychology; the enrollment in this course consisted of 86 students. This course is not required for the major in psychology at New York University and is of much interest to the students, so there is considerably less pressure than there is in the statistics course.

Both courses were taught by the same instructor (the present writer). On the day the measures in this study were administered, 25 of the statistics students and 67 of the abnormal psychology students were present.


Measures 


Satisfaction with each course was measured by an item on the anonymous student evaluation questionnaire, administered at the end of each semester by the New York University Psychology

TABLE 1

MEANS, STANDARD DEVIATIONS, AND CORRELATIONS BETWEEN SATISFACTION WITH COURSE AND EXPECTED GRADE

                     Satisfaction a     Expected grade b     Correlation between
Course         n       M       SD         M       SD         satisfaction and expected grade

Statistics     25     7.04    2.60       4.12    0.71          .48
Abnormal       67     7.16    1.62       4.44    0.63          .05

a 9 = very high (favorable) and 1 = very low (unfavorable).
b 5 = A (highest) and 1 = F (lowest).
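The two correlations in Table 1 come from independent classes, so one common way to compare them is Fisher's r-to-z transformation. The sketch below applies it to the tabled values (r = .48 with n = 25, r = .05 with n = 67). It is offered as an illustration only: the test statistic printed in the original note is not legible here, and the function name is hypothetical.

```python
import math


def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test of H0: rho1 = rho2 for two independent samples.

    Returns the z statistic and the one-tailed p value for the
    directional alternative rho1 > rho2.
    """
    if min(n1, n2) < 4:
        raise ValueError("each sample needs at least 4 observations")
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    p_one_tailed = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper tail of N(0, 1)
    return z, p_one_tailed


# Statistics course (high pressure) versus abnormal psychology (low pressure).
z, p = compare_independent_correlations(0.48, 25, 0.05, 67)
print(f"z = {z:.2f}, one-tailed p = {p:.3f}")
```

With these inputs the difference is in the predicted direction and falls below the one-tailed .05 level, in line with the significance statement in the text.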
Discussion 


which asked students to rate on 


Department,” 
would recom- 


a 9-point scale how highly they 
mend the course to another psy¢ 
While such a one-item measure might su 
low reliability, it was 


problem because they k 
affect departmental policy in p! 
semesters. At worst, à Type 2 € 
committed which would le 
with a more substantial number of items. 
Productivity was measure i 

evaluation form which asked the student to indi- 
cate the expected grade for the course. Since 
anonymity was guaranteed, 

compare satisfaction with actual grades, 
was not deemed serious because the 
each course had received CO i 
(such as examination grades 
the time of the administration 
naire and, therefore, ha n ex 
what the final grade would be. 


The statistical analysis consisted 
relating satisfaction with the 


indicate that satisfaction an 
were more highly : 
greater pressure for 
statistics course (¢ = 
one-tailed, p < 05)? 
2 Thanks are expressed to Mark Fulcomer who designed the student evaluation form used in this study.
3 In view of th 
e controversy as 

nace tests in psychological esearch (Cohen, 
D Welkowitz, Ewe? 

let i the decision as tO whicl 

to the reader. 


Support for the claim that pressure was, in [...] in statistics comes from three
sources. The first concerns the obvious effects of the requirement that all
psychology majors at New York University earn a grade of C or better in
statistics [...]. Second, the student evaluatio[...] felt significantly less well
prepared for this course by their prior academic training than students in
abnormal psychology did for that course (t = 3.76; two-tailed, p < .001; $r_{pb}$
between course and satisfaction with previous preparation = .36). Thus, in
addition to the fact
that many students undoubtedly took the course 
in statistics only because of the requirement 


(rather than because of the course 
material), many of them also felt poorly prepared 


for the course and experienced it as difficult—a 


situation likely to lead to feelings of considerable
Third, the high mean satisfaction and


fairly easy, relative to the ability of the students, 
with the result that pressure in 
fairly low. 


CONCLUSIONS


Although there are obvious limitations in the data presented in this paper,
the data do come from natural classroom settings of importance to the
students, rather than from an artificial laboratory situation in which the
main goal of the students is to earn credit for a course in introductory
psychology. The results do suggest, therefore, that differences in pressure
for production and task difficulty may, at least to some extent, explain the
perplexing variety of results that are obtained when researchers correlate
job satisfaction with job performance.
The present study examines the relationship between need satisfaction and
absenteeism for a sample of managers in a state organization. The data indicate
a positive relationship between dissatisfaction and absenteeism. The larger the
need deficiencies, as measured by Porter's questionnaire format, the greater is
the rate of absenteeism. Additional analysis reveals that the relationship
between dissatisfaction and absenteeism holds when controlling for the effects
of hierarchical level.


¹ This research was partially funded by the Central Fund for Research, College
of Business Administration, Pennsylvania State University.

² Requests for reprints should be sent to Lawrence G. Hrebiniak, Department of
[...], Pennsylvania State [...]


Many recent studies have dealt with perceived need deficiencies and
dissatisfaction among managerial personnel. Consistent with the original
findings of Porter (1961, 1962), these studies have shown primarily that need
deficiencies decrease as one considers successively higher levels of
management (Haire, Ghiselli, & Porter, 1963; Opinion Research Corporation,
[...]; [...], 1964; Slocum, 1971). These data suggest that hierarchical ascent
is marked by a greater responsibility, recognition, and positiona[...] which
contribute significantly to the satisfaction of needs important to the
individual.

It has also been sugge[...] and satisfaction of managers [...] implications for
the organization, [...] productivity, absenteeism, or turnover ([...], 1960).
Although some of the research which has dealt with the issue of need
satisfaction and performance has shown a positive relationship between them
(Lawler & Porter, 1967; Slocum, 1971), there are few [...] that examine the
[...] deficiencies and absenteeism or turnover. Considering the cost to the
organization of these problems, a consideration of absenteeism or turnover as
[...] would be of great utility.

The present research examines the relationship [...] perceived need
deficiencies [...] absenteeism for a sample of managers in a [...] setting. The
effects of hierarchical level are also [...] possible mediating influence on
[...]. The hypothesis being tested is simply that absenteeism is positively
correlated with levels of perceived need deficiency or dissatisfaction among
managerial personnel.

Data were collected from a sample of 40 managers [...] a state liquor control
board [...]. The managers represent 4 hierarchical levels ranging from unit
[...] and board members. The state organization has [...] of all licenses and
permits. Consequently, the present sample represents middle and upper managers
involved with policy and procedural decisions over [...] statewide control of
traffic in alcoholic beverages.

As in many areas of government, the liquor control organization is highly
bureaucratic. In [...] developed a large nu[...] and standard operating [...]
tribute to its rigid nature. It was felt that this organizational climate could
have an effect on [...] of individual needs, especially autonomy and
self-actualization.

[...] need satisfactions of the managers [...]ined by employing the Porter
(1961[...] questionnaire format. Subjects were asked to [...]cating how much of
the characteristic or quality being measured was presently connected with their
job and how much they felt should be connected with their job. The items were
predicated
nected with their job. The items were predicated 
TABLE 1

CORRELATIONS BETWEEN ABSENTEEISM AND DEFICIENCY SCORES FOR ITEMS IN EACH
NEED CATEGORY AND TOTAL DISSATISFACTION

Need category and item                                            Absenteeism

Security needs
  Security in position                                                .53
Esteem needs
  Opportunity for recognition and esteem                              .40
  Respect for authority connected with position                       .24
Autonomy needs
  Opportunity for independent thought and action                      .36
  Opportunity for participation in the setting of goals               .31
  Opportunity for participation in decisions regarding
    methods and procedures                                            .45
  Opportunities to challenge opinions of supervisors                  .45
Self-actualization needs
  Opportunity for personal growth and development                     .39
Total dissatisfaction                                                 .53*

* p < .01, one-tailed test. A correlation of .40 was needed for significance
at p = .01.


to a large degree in Maslow's (1943) theory of need satisfaction.

The responses to the two questions, in effect, represented subjects'
percepti[...] ([...], 1972); the larger the difference between actual and [...]
dissatisfaction,


from c 
Procedures, 


RESULTS AND DISCUSSION
Data collected i 


(the summation of discrepancy scores for the eight items) correlates
significantly with number of days absent from the job (r = .530, p < .01).
Responses to the separate items are not independent of each other or the total
dissatisfaction score; correlations between each of the items and absenteeism,
therefore, are shown in Table 1 without separate probabilities of chance
occurrence.

In line with previous research, hierarchical level is considered for its
effects. The relationship between level and dissatisfaction is not
statistically significant (r = −.288, p > .05). The correlation, however, is in
the predicted direction. With a larger number of respondents, the typical
negative relationship between level and dissatisfaction (Porter, 1961; Slocum,
1971) probably would have held. There is, moreover, a negative relationship
between level and absenteeism (r = −.344, p < .05); the higher the job level,
the lower the absenteeism. In order to test for independent effects, a partial
correlation analysis was performed for absenteeism and


order need-like sec 
absenteeism 


order needs. According to the present study, however, negative [...], that is,
absenteeism, appears to be correlated with dissatisfaction or need deficiencies
at any level of needs.

More research is needed in this area, especially data that consider both
absenteeism and turnover in relation to need satisfaction. It is probably true
that, similar to the lack of felt job tensions (Hrebiniak & Alutto, 1972; Kahn,
Wolfe, Quinn, Snoek, & Rosenthal, 1964), the [...] need dissatisfaction could be
properly regarded as an organizational asset.
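
The partial correlation analysis mentioned in the Results can be sketched from the three zero-order correlations reported there. This is a minimal illustration using the standard first-order partial correlation formula; the function and variable names are hypothetical, and the value it produces is not a figure taken from the source, whose own partial correlation is not legible in this text.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation between x and y with z held constant."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return numerator / denominator


# Zero-order correlations reported in the text (N = 40 managers)
r_dissatisfaction_absenteeism = 0.530
r_level_dissatisfaction = -0.288
r_level_absenteeism = -0.344

# Dissatisfaction-absenteeism relationship with hierarchical level partialed out
r_partial = partial_correlation(
    r_dissatisfaction_absenteeism,
    r_level_dissatisfaction,
    r_level_absenteeism,
)
print(f"partial r = {r_partial:.3f}")
```

With these inputs the partial correlation stays clearly positive (about .48), which is in line with the statement that the relationship between dissatisfaction and absenteeism holds when hierarchical level is controlled.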
ON THE DEVELOPMENT OF CONTINGENCY THEORIES OF
LEADERSHIP: SOME METHODOLOGICAL CONSIDERATIONS
AND A POSSIBLE ALTERNATIVE ¹


ABRAHAM K. KORMAN ²
Baruch College, City University of New York


The contingency theory approach to leadership, 
which has as its major t i 


Which do not Support Fiedler's ( 
Contingency model 


¹ An earlier version of this [...] University of Chicago Graduate [...] in
December 1971. I am indebted [...] of the University [...], Raymond A. Katzell,
Joseph Weitz, and Edward Levine of New York University, and Richard Hoffman of
the University of Chicago for their comments on this earlier version.


² Requests for reprints should be sent to Abraham K. Korman, Baruch College,
City University of New York, [...] Lexington Avenue, New York, New York 10010.


in essence, knowledge of
ve strategy is suggested,


x = (a) dimension of leader behavior,

y = (a) criterion by which the leader's effectiveness may be determined, and

z = some situational variable,


then the correlation between x and y is predicted to assume a different
functional form at different levels of z. That is, it is predicted on an a
priori theoretical basis that for level $z_1$ the correlation of x and y will be
of Form A, and for level $z_2$ it will assume a different form. Included here
are cases ranging from those in which the relationship between the contingency
variable and the predictor-criterion combinations is linear (i.e., $r_{xy}$ is
always in the same direction but differs in


between leader
Performance or


NONCONTINGENCY THEORIES OF LEADERSHIP


A noncontingency theory of leadership states that there is a significant
correlation between some form of leader behavior (x) and a criterion (y) and/or
that there is a significant correlation between a measure of situational [...]
criterion (y). In this case the theorist proposes that each of these [...] will
be significant without explicitly taking into account any possible contingency
variable which might affect the predicted relationship. In other words, while
contingencies are [...] a part of any theory, in this case these contingency
variables
are not formally incorporated into the theoretical statements on an a priori
basis. As examples of noncontingency theorizing, we can cite various
personality theories of leadership which have been developed, tested, and
conceptualized without regard for situational parameters, for example,
Ghiselli's (1971) trait theory and McGregor's (1960) theory Y organizational
approach to leadership.


PROBLEMS IN SPECIFYING LEVELS OF THE 
CONTINGENCY VARIABLE 


In designing a theoretical test of the contingency theory of leadership, the
researcher, as part of his design, needs to state which levels of the
contingency variable he will use in his research and to defend his claim as to
why those levels constitute an adequate test of his model. Without such
knowledge prior to the undertaking of the research, a proper test of the
contingency hypothesis cannot be made, as the following examples indicate:


1. A researcher chooses levels where $z_1 = 8$, $z_2 = 10$, and $z_3 = 12$ in a
test of the hypotheses that $r_{xy}$ = high positive at $z_1$, $r_{xy} = 0$ at
$z_2$, and $r_{xy}$ = high negative at $z_3$. He then fails to find support for
his hypothesis since [...] $r_{xy}$ = high positive at all levels. [...]
conclusions may be in error, since [...]able that the predicted conting[...]
have been found if he had chosen levels of the contingency variable where
$z_1 = 8$, $z_2$ = [...], $z_3 = 20$. On the other hand, if prior statements
concerning the level of z are not made (but only that x and y are positively
related), then any result provides a basis for valid inference as to whether
the hypothesis is viable or not. In the contingency case, an appropriate
inference is not conceivable, since if the researcher [...] levels where
$z_1 = 8$, $z_2$ = [...] hence, found support for [...] not obviate the
possibility that the [...] $z_4$ level could have resulted [...] $r_{xy}$ that
is high positive; this would [...] more complex function.
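
The point of the example above, that the levels of z chosen in advance determine whether a contingency can be detected at all, can be illustrated with a small simulation. Everything in this sketch is hypothetical: the function `true_slope`, the assumption that the x-y relationship changes sign at z = 15, the sample size, and the wider set of levels (8, 14, 20) are chosen only for illustration and are not taken from the article.

```python
import random


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def true_slope(z):
    """Hypothetical contingency: positive below z = 15, negative above it."""
    return (15 - z) / 7


def observed_r(z, rng, n=2000):
    """Simulate n (x, y) pairs at situational level z and return r_xy."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [true_slope(z) * x + rng.gauss(0.0, 1.0) for x in xs]
    return pearson_r(xs, ys)


rng = random.Random(0)
narrow_levels = {z: observed_r(z, rng) for z in (8, 10, 12)}   # levels as in the example
wide_levels = {z: observed_r(z, rng) for z in (8, 14, 20)}     # hypothetical wider spacing

for label, result in (("narrow", narrow_levels), ("wide", wide_levels)):
    print(label, {z: round(r, 2) for z, r in result.items()})
```

In this sketch the narrow set of levels yields positive correlations throughout, so the researcher would wrongly conclude that no contingency exists, while the wider set reveals the change of sign.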


The researcher may avoid [...] indeterminacies in appropriate inference [...] a
priori specifications as to the [...] of the environmental variable. In order
[...] these specifications, there are two [...] be dealt with, one of logic and
one of measurement.

First, the problem cannot be resolved logically unless the researcher has some
preexisting empirical or theoretical reason for predicting at what specific
level of the contingency variable there will be a functional change in the
correlation between x and y.

The researcher may empirically derive the 
critical values of a contingency situational vari- 
able in preliminary research and then test to see 
whether and under what conditions these critical 
values hold. All that is crucial is that the behav- 
iors associated with the specific critical values 
be noted and attempts at replication be made. 
However, this approach has at least two of the 
disadvantages of any purely empirical procedure. 
First, it allows for an infinite number of levels 
of the situational variable to be tested in an 
infinite variety of situations. Second, it provides 
no guidance as to the conditions under which the 
relationships found will generalize to new situa- 
tions. In this sense, then, the relationships found 
must be situation-bound and time-bound, unless 
there is some reason to think otherwise. Because 
of these considerations, the adoption of a purely 


done in terms of the measuring operations to be used for assessing the
contingency [...] of the theoretical constructs themselves. If they are not
stated in this manner, he will not have an adequate test of the hypothesis for
the reasons we have indicated.

A further complication is that the attempt to 


specify the scores on some measure which are 
“critical” involves the following assumptions: 


1. Different specifiable levels of the measure[...]ately reflect different
levels of the theoretical construct that it is presumably measuring (i.e., the
measure has construct validity).

2. The behavioral significance of each level of [...] procedure in different
situations is [...]tically, information of this nature [...]sis of normative
studies [...] (or experimental [...] in different contexts. However, theoretical
guidelines do not now exist by which [...]ify the relevant situational
parameters that would affect the behavioral significance of these different
levels of the contingency variable, as measured. Hence, researchers in the
field use "seat of the pants" procedures in assessing the significance of
different scores. This, in turn, leads to contradictory findings.

Considering, then, the questions of logic and of
measurement involved in a priori specification of 
the critical values, it is not surprising that the 
results of tests of contingency leadership theories 
have been inconsistent. In contrast, an advantage 
of the noncontingency model is that a priori 
specification of the critical values is not required. 
The researcher may ignore all questions concern- 
ing the variation of the z variable, that is, it 
does not matter if he is dealing with a situation 


in which there is only one value (or an infinite 
number of values) of the z variable. Whatever 
values these contingencies assume in any given 
theoretical test of the model are unimportant for 
the purpose of drawing inferences from the re- 
search, since the same functional relationship be- 
tween x and y is always predicted and then 
observed for its occurrence in the given situation. 
When different relationships are found from those 
predicted, an inference that the theory is not 
supported is appropriate. 

The contingency theory researcher has other 
problems facing him as well. 


TIME PROBLEM 


It seems clear that a theoretical system which is based on the proposition that
given Situation A, Leader Behavior A is best, while given Situation B, Leader
Behavior B is best, must make one of two assumptions. One is that the world is
a "static" one in which the relationships being postulated are invariant. Such
an assumption seems to be a naive one, considering all that we know about the
nature of human behavior and its dynamic changing qualities. On the other hand,
suppose the researcher makes the assumption of a "nonstatic world." Two cases
then occur. In one case, he may further assume that in all tests of the model
in different situations, the person has been interacting with the situation for
the same period of time. Hence, only one critical value need be developed. In
the other case, there is the alternative of developing values reflecting change
in the situational critical values as a function of time in the situation.

The first case is more naive than the assumption of a static world. The second
case, that of injecting a parameter value representing time in situation (or
degree of experience), has the effect of making the problems discussed in the
previous section even more complex, since it is now necessary to consider the
amount of time over which the specific contingency relationships to be tested
have been taking place. Such considerations then have to be taken into account
when developing critical values that will provide an adequate test of the
model.




PROBLEMS CONCERNING THE SITUATIONAL 
MEASUREMENTS USED 


A third question concerning the testing of contingency leadership models
involves the [...] necessity that different levels of the [...] measures should
differ from one another [...] the manner specified. However, a dilemma [...]
one is using different levels and types of situational variables is that the
larger the variance of a measure, other things being equal, the greater the
correlation it can have with other variables. Hence, the fact that we are
deliberately introducing situational variation in our research when we engage
in contingency theorizing means that we may introduce effects upon our results
[...] variables that are correlated with our situational variables. To the
degree that this is true, inferences made concerning the nature of the proposed
relationships may be incorrect.
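
The statement that a measure with larger variance can, other things being equal, show a larger correlation with other variables can be illustrated by restricting the range of one variable in simulated data. The sketch below is hypothetical throughout: the slope of 0.5, the cutoff of 0.5 standard deviations, the seed, and the sample size are illustrative choices, not values from the article.

```python
import random


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(1)
n = 5000

# The same underlying linear relationship holds for every case.
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]

# Full-range sample: x varies freely.
r_full = pearson_r(x, y)

# Restricted sample: keep only cases where x lies in a narrow band,
# which sharply reduces the variance of x.
restricted = [(xi, yi) for xi, yi in zip(x, y) if abs(xi) < 0.5]
r_restricted = pearson_r([p[0] for p in restricted], [p[1] for p in restricted])

print(f"full range:  r = {r_full:.2f} (n = {n})")
print(f"restricted:  r = {r_restricted:.2f} (n = {len(restricted)})")
```

Although the relationship between x and y is identical in both samples, the correlation is markedly lower when the variance of x is reduced, so a comparison of correlations across situations that differ in variance can mislead.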

As an example, consider a possible hypothesis 
in which a moderating effect of leader behavior 
is predicted for different kinds of physical envi- 
ronments. The logic here is that physical environ- 
ments may differ from one another in the degree 
of positive self-evaluation which they encourage. 
Could all results of such a test be attributed to 
this type of psychological variable? Such a con- 
clusion would be doubtful, considering what we 
know about the “arousal” properties of different 
kinds of physical environments and the signifi- 
cance of different levels of arousal for behavior. 


WHAT SHALL WE DO?


Before suggesting an alternative, it is desirable to stress that there is no
inclination to argue that these problems are unsolvable. Clearly they are not.
Nor are these points new. They are anticipated in the work of Fleishman and
Harris (1962) who pointed out the need to be concerned with the levels of
supervisory consideration and initiating structure measured in a particular
situation since some levels of these variables are of little behavioral
significance in affecting employee turnover and grievances, while others are of
great significance.

There are two problems to be solved if contingency models are to fulfill their
promise theoretically and if they are to provide a guide [...] practice. First,
the measurements must have construct validity. Second, we need to know [...]
each level of the contingency variable [...] means in terms of behavioral [...].
Research needs to be undertaken in both these directions.

A second redirection stems from [...] programmatic equation of Lewin ([...])
that B = f [P (person), E (environment)]. We need to [...] effects of
personality variables and situ[...] both cross-sectionally and longitudinally,
as separate variables affecting behavior [...] as joint variables affecting
behavior [...] an additive or multiplicative relationship to one another.
However, what would [...] predicted on an a priori basis would be that the
behavioral significance of any value of [...] independent variable would show
interactive, nonlinear relationships with the particular values of the other.
Thus, if


x = personality variable,
y = situational variable, and
z = criterion variable,


[...] and develop predictions for such [...]. In addition to these theoretically
predicted relationships, we would also be highly sensitive to and carefully
analyze the nonpredicted interactions when they occur. Eventually, then, out of
such an empirical data-building process, we may be able to develop sufficient
knowledge to actually enable us to make the a priori predictions constituting
the contingency model.

To conclude, contingency predictions are eventually always necessary since
every theory is a contingency theory in that limitations (or contingencies) to
delimit the range of the theory need to be brought into the picture. The
question is at what point do we propose a priori hypotheses concerning the
limitations (or contingencies)? Contingency theorists bring in contingency
hypotheses at the beginning of the theory-building process. We also suggest
utilizing the alternative strategy of bringing them in when they are
empirically supported and shown to be needed. The fruitfulness of this
procedure has been shown by researchers dealing with consideration and
initiating structure (cf. Fleishman, in press). Its value is that it overcomes
the logical and methodological problems we have discussed. Using both
procedures appropriately should lead to better, more replicable theories of
leadership in the future.

It may be noted that the term [...] is a moderator variable term whereby the
predictive effectiveness of x may change as a function of the particular value
of y. However, this is not a contingency variable model in the sense that we
are discussing it here. Zedeck (1971) has recently pointed out these important
differences between different types of moderator variables in personnel
selection theory, although the major thrust of his discussion is not aimed at
the kinds of questions which have concerned us here.

I am indebted to John B. Miner of the University of Maryland for this
observation.
DIMENSIONAL ANALYSIS OF ATTITUDES TOWARD
COMMERCIAL FLYING 


ARNON PERRY 


Tel-Aviv University, Israel 


METHOD 


A 50-item attitude questionnaire embodying 5 "construct" factors (safety,
economy, service, speed, and prestige) was constructed. Ea[...] required a
response on a 7[...] (neutral point omitted) ra[...] agree to strongly disagree.
[...] administered during early D[...] to a group of 355 [...] University of
Texas, Austin students who [...]. In addition to responding to the 50-item form,
[...]dents indicated their sex and whether they had flown before or not.

All correlation coeffici[...] the 50 attitude questio[...]

Findings 


remaining 3 factors appeared to account for a 
very small Ortion o the common variance, 

i entage of variance 
actor, representative 


Table 1, 
Variance acc 


t uniform and 


S. THOMAS FRIEDMAN 1 


University of Texas, Austin 


narrow range (3.2-5.6%). This is not very often
the case in a factor structure of this magnitude.
Usually one or two factors account for a large
share of the common variance of the data. This
even variance distribution suggests that the


Variables Associated with Factor Pattern 


In addition to revealing the cognitive i 
of college students with Tespect to flying and t 
airline industr 


and flying experience would be related to this 
attitudinal structure, 


actors and the sex and flying 
experience of the respondents, Only five ae 
showed a Significant relation to at least one “i 
Only 44 out of 355 subjects ha 
never flown in this group of college MIR E 
Those who had flown scored significantly high 
; that is, they believe 


Were significant, with the ix 
males having more Postive attitudes about air 
Ore product conscious. 1 flying 
In Conclusion, it appears that commercia the 
is a widely accepted phenomenon kem 
college-age population, that Bale aoe a 
commercia] flying are highly ee e fac- 
rmly str rtain 0 
Y Structured, and that ce cie 
² The classification of subjects into flyers and nonflyers may be misleading,
since it is quite possible that those that were classified as nonflyers just
happened to be these by "chance" simply because they never had the opportunity
to fly.
TABLE 1

PRINCIPAL AXIS (VARIMAX ROTATED) FACTORS OBTAINED FOR A 50-ITEM ATTITUDE
QUESTIONNAIRE TOWARD FLYING AND AIRLINE INDUSTRY

Representative item    Loading

Factor 1: Advertising effectiveness (5.6% variance)
34. All the money airlines spend on advertising is a waste, because what really matters to people when they decide to fly is the departure and arrival time.    −.76
35. The slogans airlines use in their advertising are very effective.    .61

Factor 2: Return for the money (3.1% variance)
13. Most people will be willing to forego the extras of air travel like food, movies, etc. for a reduction in air fare.    −.54
12. Air travel is too expensive compared to other means of transportation.

Factor 3: Image of the industry (4.5% variance)
29. Other things being equal, most people will be glad to work for an airline.
32. Airlines' advertising is truthful.

Factor 4: Airline differentiation (4.1% variance)
5. All airlines are the same.    −.74
10. Some airlines are better than others.

Factor 5: Age and safety (3.4% variance)
25. The attitude toward safety with respect to flying has nothing to do with the passenger's age.    .79
22. Young people do not give as much weight to the safety factor in flying as compared to older people.    −.16

Factor 6: Role of travel agency (3.4% variance)
45. A recommendation of a travel agent makes all the difference in the decision what airline to use.
43. When I [...] to fly, I prefer to contact the airline directly, rather than going through a travel agency.    −.61

Factor 7: Government regulations (3.4% variance)
37. The airline industry should be less regulated by the government.    .84
40. Government regulations restrict the development of the airline industry.    .79

Factor 8: Disadvantage of flying (3.7% variance)
3. While traveling on vacation, I prefer to drive so that I can stop and go anywhere I want, and have [...] change my mind.    [...]
4. [...] over flying is that the traveler is not confined to a specific preplanned schedule.    .68

Factor 9: Discrimination among passengers (3.5% variance)
49. The airlines should give a different treatment to those who fly for personal purposes or vacation, as compared to those who fly for business purposes.    [...]
48. Airlines should make a distinction between those who pay for their trip out of their own pocket and those who pay for their trip from expense accounts.    .79
Table 1 (Continued)

Factor 10: Transportation mode selection (3.1% variance)
2. Most people prefer to fly rather than drive.
27. People who travel for a long distance by car do it mainly for a need of transportation at their destination.

Factor 11: Safety (4.1% variance)
21. Flying is unsafe.
26. Flying is as safe as any other mode of transportation, and probably safer.

Factor 12: Airline's performance (3.3% variance)
7. An airline's reputation depends on time of departure and arrival.
6. All people really care about when they fly is to get to their destination.

Factor 13: Service (3.6% variance)
[...]ceived from stewardesses on board the plane is the most important factor in [...] attitude toward an airline.
[...]ost airlines offer to young travelers under 22 on a standby basis makes them waste a lot of time and only antagonizes them.

Factor 14: Discount for young passengers (3.2% variance)
19. Without the discount to young travelers under 22, most of them would not have been able to fly.
15. Most of those who use the youth discount fare are for trips they would have taken anyway, even if the discount were not offered.

Note. N = 355.


tors are related to [...]
Separate dimensions [...]
provide areas by which airlines [...]
consumer attitudes [...]
Promotional campaigns aimed at women should stress more of the safety and
security aspects of flying.


Kaiser, H. F. The varimax criterion for analytic rotation in factor analysis.
Psychometrika, 1958, 23, 187-200.
A PRELIMINARY STUDY OF THE EFFECTS OF CRASH
HELMET VISOR COLOR ON COLOR RECOGNITION


HAROLD D. WARNER ¹


University of Missouri at Rolla


The purpose of this study was to investigate the effects of crash helmet visor
color on color recognition. Sixty subjects were asked to report whether [...]
yellow, green, or none of these [...] the slides through either a colored crash
helmet [...] were blue, green, orange, smoke, and yellow. The results showed
that the colored visors increased the number of color recognition errors.
Restricted use of colored visors is advised.


A bulletin issued from the Manitoba Government News Service (1971) advised
motorcyclists who possessed blue-tinted bubble-type helmet visors to destroy
them. It was claimed that the visors "make it impossible to distinguish the
color red." Two motorcycle fatalities, in which blue visors had been worn, were
cited as proof of the warning.

A poll conducted at the University of Missouri at Rolla campus involving a
sample of 100 students showed that motorcyclists and nonmotorcyclists were not
aware of the po[...] of blue helmet visors. Since s[...] interviewed in the
survey owned colored visors other than blue, a frequent question was: "What
effects do other colored visors have on driving safety?"

A review of the relevant literature indicated that the hazard associated with
cra[...] preliminary study o[...] visor colors on color recognition wa[...]


METHOD

Subjects

Sixty male and female student volunteers were tested. The students were
enrolled in an introductory psychology c[...]ven extra credit for participating.
Subjects were [...] for visual color deficiencies prior to testing.


Performance Task

The display conditions of the task [...] 16 35-millimeter slide presentations.
[...] contained a combination o[...] constructed from paper cutouts. Th[...]
beige, black, blue, brown, orange, p[...], and violet. While all the slides
contained at least these colors, in 4 slides the color red was added; in 4,
yellow was added; and in 4, green was added. In the remaining 4 slides, none of
these three colors was present. To insure uniform color brightness and color
purity, sub[...]ts of equality were obtained from [...] judges before a color was
included. The paper cutouts were arranged differently in each slide to prevent
the association of particular colors with particular positions.

During testing, subjects were enclosed in a 3 x 3 x 6 foot test cubicle. The
subjects viewed the screen on which the slides were displayed through an open
"window" [...] the cubicle. A response [...] cubicle consisting of fo[...]


¹ Requests for reprints should be sent to Harold D. Warner, Department of
Soci[...], University of Missouri, Rolla, Missouri [...]


Helmet Visors 

] crash helmet visors (Paulson 
n , Model No. 3) were pur- 
m a motorcycle equipment shop. The 
visors were plue, yellow, orange, smoke, green, 


and clear. 


Procedure 
seated in the cubicle and were 
ions. They were in- 
Switch Number 1 if they saw 


of six experimental conditions. The experimental conditions were defined by the five colored helmet visors and the clear visor. The appropriate visor for each subject was placed over the open “window” after the demonstration slides were shown.
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TABLE 1 


COLOR RECOGNITION ERRORS AS A FUNCTION OF
CRASH HELMET VISOR CONDITION

                       Error distribution
Visor     Total    Red    Yellow    Green    Other

Clear       15       1       2        2       10
Smoke       32       5       3        6       18
Green       35       6       9        5       15
Orange      38       6       9        8       15
Blue        48       8      21        8       11
Yellow      58       9      12       12       27


Since each subject viewed 16 slide exposures, a total of 160 responses were recorded for each experimental condition. The 16 slides in each test series were presented at a rate of 1 slide per minute.

RESULTS
Incorrect responses were analyzed by means
of a lysis of i i 


SHORT Notes 


df = 18, P « 05); blue-clear (£25.51, df — 8j 
5 «.01); 
< 01). 

The proportion of incorrect responses to the total number of responses (160) was also calculated for each visor condition. The percentages of errors for the clear, smoke, green, orange, blue, and yellow conditions are 9.4, 20.0, 21.9, 23.8, 30.0, and 36.3%, respectively.
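
The percentages above can be reproduced from the error totals in Table 1. The following is a minimal sketch, not part of the original report: the dictionary and function names are illustrative, and half-up rounding is an assumption made because it matches the reported figures (for example, 21.875 is reported as 21.9).

```python
from decimal import Decimal, ROUND_HALF_UP

# Total incorrect responses per visor condition, as listed in Table 1.
ERRORS = {"clear": 15, "smoke": 32, "green": 35,
          "orange": 38, "blue": 48, "yellow": 58}

# The text reports 160 responses per experimental condition.
RESPONSES_PER_CONDITION = 160


def error_percentage(errors, total=RESPONSES_PER_CONDITION):
    """Percentage of incorrect responses, rounded half-up to one decimal."""
    pct = Decimal(errors) * 100 / Decimal(total)
    return pct.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


# Map each condition to its percentage, formatted as a string.
percentages = {visor: str(error_percentage(n)) for visor, n in ERRORS.items()}
```

Decimal arithmetic is used here because the built-in `round` applies round-half-to-even to exact ties, which would not match the published values.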


DISCUSSION


The results of this study indicate that some 
colored crash helmet visors, namely, green, ie 
and yellow, increase the probability of inaccurately
identifying colors such as red


recommended. With regard to ng 
ue visors make it impossible 


that the blue visor hampers accurate recognition 
of red but does not make it “impossible” to seet 

Although only crash helmet visors were used in the present investigation, the results are not limited to this type of colored light filter. Other filters, such as colored sunglasses, can be expected to have similar effects on color recognition. Thus, it can be assumed that a proportion of automobile accidents are due to colored sunglasses worn by accident-involved drivers. For this reason, restricted use of colored sunglasses is also advised.

It should be noted that the stimulus materials used in this study did not permit the utilization of supplemental detection cues which are normal to driving situations. These include stimulus form, patterning, and position. Since these cues were unavailable, the present data would be applicable mainly to situations in which the driver relies on color cues.
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Services Branch, L 
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> 1971. (Press release) 
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and yellow-clear (¢ = 3.78, df = 18, ? 


Color red, the present data suggest N 
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AN AUTOMATED PATIENT BEHAVIOR CHECELIST* 


DONALD W. MORGAN, JEFFREY L. CRAWFORD, SINAI I. FRENKEL

Walter Reed General Hospital

JAMES L. HEDLUND

Missouri Institute of Psychiatry

The development of an automated, 47-item patient behavioral observation checklist is described. A factor analysis of 689 sets of ratings on 103 patients yielded four factors: acting out, depression-withdrawal, degree of disturbance, and adaptation to the ward. A sample of a patient's behavioral record is presented and the applications and implications of the automated procedure are discussed.


Computer Support in Military Psychiatry is a clinically oriented computer applications research project that is developing and testing concepts for an integrated psychiatric information system. One of the developments of Computer Support in Military Psychiatry is an automated patient behavior checklist to be used by ward personnel to evaluate patient status on a day-to-day basis. The present paper discusses the development, application, and implications of the automated patient behavior checklist.

In an inpatient psychiatric treatment milieu, hospital personnel (nurses, technicians, etc.) ordinarily have the greatest opportunity to observe a wide range of patient behaviors. These observations, although seldom systematically recorded, usually take the form of handwritten notes that are included in the patient's medical chart. Individual differences in training, conceptual framework, language use, and style of expression tend to make such observations highly subjective, unreliable, and, consequently, difficult to use as a major source of information. On the ward, informal verbal communications among nursing personnel, ward technicians, psychologists, and

i'This study was conducted as part of a larger 
clinical computer applications research project, Com- 
in Military Psychiatry, and was 
the U.S. Army Medical Research 
and Development Command, Washington; D.C. The 
acknowledge and express their 
to Bernard C. Glueck, Jr. and his staff 
at the Institute of Living, Hartford, Connecticut, for 
their strong support and encouragement throughout 


this project. 
2 Requests and for the patient be- 
havior checklist and documented programs should 
be sent to Donald W- Morgan, Department of 
Psychiatry and Neurology, Computer Support in 
Military Psychiatry, Reed Army Medical 


Center, Washington, D.C 


psychiatrists are often the principal media for information exchange. The behaviors that are usually noted tend to be those that are extremely destructive and negative. Behaviors that do not interfere with ward routine tend to go unnoticed, or at the very least, unreported. As a result, much important data concerning patient behavior while in the hospital is lost.

The Institute of Living has developed and utilized an automated nursing notes system to obtain day-to-day observations of psychiatric patients throughout their course of hospitalization (Rosenberg & Glueck, 1967, 1969; Rosenberg, Glueck, & Bennett, 1967). Their Patient Behavior Index consists of a two-page printed form containing objective, checklist descriptions of patient behavior.

Because of the potential advantages of such a nursing note system, Computer Support in Military Psychiatry's initial pilot study investigated the Patient Behavior Index's direct applicability to an Army psychiatric treatment milieu and patient population. The results of this study indicated, first, that the contents of many Patient Behavior Index items were inapplicable to the Walter Reed treatment milieu and patient populations. Second, the Patient Behavior Index procedure of rating only those items considered to be present resulted in frequent, inadvertent rating omissions and low observer reliability. The Computer Support in Military Psychiatry staff, therefore, made the decision to develop an automated behavior-rating checklist utilizing the pilot data, together with some of the Patient Behavior Index data on factor structure, that would be directly applicable to the Walter Reed treatment milieu.


METHOD 
Procedure 


Of the 112 Patient Behavior Index behavioral
items, 50 were selected as having optimal rele- 
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TABLE 1 
BEHAVIORAL OBSERVATION FACTORS AND ITEM LOADINGS


" ; vard 
Depression/withd Degree of disturbance Adaptation to war 
Acting out factor p onpil eal i: tentar factor 
Item Item Loading Item Loading Item Loading 
—.46 
Impolite Monotone voice | 767.77 Restless -26/.53 ates — einer 
Impulsive ‘oor hygiene 36/.41 Hallucinations 04/.54 en disposition 7 
Irritable Zating 37/.34 acing -20/.54 | p. ticipated 
Sarcastic Inadequate 39/.26 | Odd :50/.27 Relaxed 
Demanding Fearful 58/.18 Confused -26/.63 Clean/neat 
Angry 78/71 | Slow speech 82/.67 uspicious :28/.35 
Threatening 617.77 Tearful . Rambled -62/.06 
Vulgar language 49/.65 Seclusive - orgetful :48/.33 
Annoyed 43/.60 lept s epetitious -51/.49 
Resentful 71/.64 Apathetic 45/.57 aranoid 237.29 
Argumentative 71/.73 | Slow 80/.75 ught attention -61/.21 
Griped 62/.61 Suicidal 10/.02 nsomnia 117.29 
"Moody 47/.60 | Drowsy 19/.79 | Overactive 1547.32 
ad 517.51 oisterous -38/.02 
Quiet 717.58 
Note. The numbers next to each behav; 


vance to the military psychiatric patients and 
treatment milieu, minimal overlap or redundancy, 


observer reliability, while at the 


example, pacing: walked bac 
and forth. N, 


RESULTS AND DISCUSSION

The factor analysis resulted in 
tered into four actors: 
withdrawal, degree of di 
tion to the ward. T. 


ioral item Tepresent the factor loadings for the day and evi 


ening shifts, respectively. 


puter and factor 


Scored. By totaling the raw 
Scores for each 


patient on each factor based upon 


jects. A graph of the patient's behavior over time is then printed out and progress can be viewed on a day-to-day or weekly basis (see Figure 1).


daily ratings 
room on the 
graph. Then, the latest daily rating simply Te- 


Point to make room for 


ization plus ratings for the :ent's 
Second, a daily summary report of the patient's behavior can be printed, if desired by the staff. Currently, work is being done to develop a weekly summary report as well as a list of indicators to aid in predicting maladaptive behaviors such as being away without leave, ward violence, etc.
Staff reaction to the Computer Support in Military Psychiatry patient behavior observations was mixed.
The staff of the milieu pos
Ward, although Cooperative in helping us es 
data for this research project, men pee 
make the Computer patient behavior observati 




operational. However, the staff on an operant conditioning ward (token economy) was more enthusiastic. At the present time, the computer patient behavior observation is being utilized in the operant ward as a measure of response generalization. The computer patient behavior observations are also kept on selected patients throughout the psychiatry service. It is planned that the computer patient behavior observations will be used on all of the psychiatric wards at Walter Reed. The information obtained from the computer patient behavior observations should aid hospital personnel in decisions regarding the effectiveness of treatment programs, passes, and leaves as well as decisions concerning termination of hospitalization.

[Figure 1 appeared here: a printed patient profile plotting daily ratings by date for each factor (for example, ACTING OUT). Severity scale: very severe, severe, significant, minimal, little or none, no observation. Adaptation scale: excellent, moderate, poor, very poor, extremely poor, no observation.]

FIGURE 1. Sample patient profile. (The asterisks indicate factor placement on the day shift, while the letter E designates factor placement for the evening shift.)
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CONSTRUCT VALIDATION OF AN INSTRUMENTALITY- 
EXPECTANCY-TASK-GOAL MODEL OF WORK MOTIVATION:
SOME THEORETICAL BOUNDARY CONDITIONS ¹

H. PETER DACHLER ² AND WILLIAM H. MOBLEY ³

University of Maryland


An instrumentality-expectancy-task-goal model that sought to add specificity to the conceptualization and measurement of variables by focusing on employees’ decision process was tested in two organizations. Support for the major links in the model was found in one of the organizations, while the results from the other organization generally failed to support the hypotheses. Special attention is given to the potential impact of the organizational environment on employee cognitions and motivation. The differences in results between the two organizations are attributed to possible boundary conditions that may affect the relationships among employee cognitions, task goal selection, and performance. Implications of the basic model and its boundary
conditions for rese
model facilitated the observation
adding to the understanding of


and theory con-
e given increasing attention to valence-instrumentality-expectancy (VIE) models (see Campbell, Dunnette, Lawler, & Weick, 1970; Graen, 1969; Lawler, 1971; Porter & Lawler, 1968; Vroom, 1964). Reviewers have generally concluded that empirical tests of VIE models have provided moderate support and that therefore VIE theory shows promise of providing a scientifically useful model of work motivation (Campbell et al., 1970; Graen, 1969; Mitchell & Biglan, 1971). However, a closer examination of the research reveals (a) a number of inconsistent findings, (b) rather modest relationships between the theoretical components and performance, and (c) inadequate support for the hypothesized interaction effects of the theoretical components on behavior (Heneman & Schwab, 1972; Miner & Dachler, 1973; Mitchell, 1972).

This state of VIE theory is at least in part
attributable to limitations that have charac- 


1 This research project was made possible by finan- = 

the General Research Board and various agencies and companies involved. Appre- 

wan. the Department of Psychology of the University 9 ciation is expressed to these people who are too 
numerous to mention. We owe 2 great deal to Tove 


Maryland, as well as by funds provided by the Office 
through Community Action Hammen, Gene Hoffman, Joe Schneider, and Kent 
Program Grant 2653. Computer time was provide Boyd, all of whom spent long hours in helping over- 


f the University come some of the many problems presented. by ive 
scope and nature of this study. Th 
work for a larger study of organizational behavior of our colleagues, Ed Locke ds eec s s 
i ed doctoral dissertation submitted On earlier drafts of this paper are gr Set E = n: 
nd author to the Graduate Faculty of the edged. 8 y acknow?- 
University of Maryland. The authors have been listed 2 Requests for s 
k "an reprints sh 
alphabetically 3s truly joint authors. pachler: At Ee poe Le o T E 
This project has depended on the effort, enthu- Maryland, College Park, Maryland Pag sity 0 
i , B 
cooperation of many 3 Now at College of Business Administration, Uni- 


siasm, good will, skills, and 
people both at the University of Maryland and the versity of South Carolina 
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terized much of the existing research. There 
has been a disturbing amount of inconsistency in the definition and measurement of the theoretical components, and the analytical procedures used in many of the studies indicate confusion in the interpretation of the basic VIE model (Heneman & Schwab, 1972; Mitchell, 1972). Therefore the design of studies and the measures used to assess component variables often have not conformed to the strict requirements of VIE theory.

The purpose of the present research was to attempt to improve the understanding of the motivation process delineated by VIE theory through greater specificity in the conceptualization and measurement of the key variables and the interrelationships among key variables and behavior. Cronbach and Meehl (1955) and Dulany (1968), among others, have argued for the necessity of subjecting detailed theoretical networks to research as a means of making our constructs more explicit.

This Paper outlines a detailed VIE model 


model are Presented together 
Plications for VIE theory, 


Characteristics of 


recorded research, the model outlined in 
Figure 1 Was develo 


plain the Process of work motivation in ter 
of employee Cognitio i 


tential performance levels tha 
formance Which, on the b 
liefs and feelings, is thou 
useful for 




to be the most useful among the various potential performance levels at which he could work. If employees make an intentional choice among different possible performance levels at which they might work, the question becomes, What level of performance does a worker choose to work toward and why?

The hypothesized answer to this question is laid out systematically in Figure 1. Cell 1 is expectancy, which refers to a person's subjective probability, or the perceived likelihood that he can perform at a given level of performance. This term can be thought of as varying from 0 to 1. Other factors being equal, the lower the perceived chances of attaining a given level of performance, the less likely a person will be to try to perform at that level.

Cell 2 in Figure 1 is utility. Utility refers 
to the usefulness or attraction of a particular 
level of performance. The utility of working 
at a given level of performance is a result of 
the combination of two factors: 


Cell 2a, performance-work outcome probabilities, that is, the instrumentality or how certain the employee is that a given level of performance will lead to various rewarding or punishing consequences (pay, recognition, boredom, pressure, advancement, group acceptance, etc.; Cell 7).


Cell 2b, work 
that is, the valenc 


outcomes are, 


[Figure 1 diagram. Cell labels recoverable here: (2) Utility of Level of Performance; (2a) Performance-Work Outcome Probabilities; (2b) Work Outcome Desirability; (9) Situational Restraints.]

FIG. 1. Model of work motivation.

Figure 1 shows that the person's perceived chances that a given level of performance will lead to a given work outcome (Cell 2a) are combined multiplicatively with the desirability of that work outcome (Cell 2b). This aspect of the model reflects the hypothesis that utility is related to the interaction of instrumentality with work outcome desirability. Obviously there is more than one work outcome associated with each possible level of performance. For each work outcome, then, the product of the probability statement (relating a given level of performance to a given work outcome) times the desirability of that work outcome is calculated, and all of the products are summed. The result is one number summarizing the extent to which an employee feels he will be rewarded or punished for performing at a given level of performance. This number represents the index of utility or attraction for a given level of performance (Arrow A).

Desirability of a work outcome is considered to vary from +1 (very desirable) to -1 (very undesirable). Performance-work-outcome probability varies from +1 (a level of performance is sure to lead to an outcome) to 0 (a level of performance is not related to an outcome). The higher the perceived probability that working at a particular level of performance leads to outcomes that are desirable (and does not lead to outcomes that are undesirable), the higher is the utility of that level of performance.
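
The utility index described above (a sum of probability times desirability over all work outcomes) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the input format, and the example ratings are hypothetical and do not come from the study's instruments.

```python
def utility(outcomes):
    """Sum of probability x desirability over all work outcomes.

    `outcomes` is a list of (probability, desirability) pairs, with
    probability in [0, 1] and desirability in [-1, +1], following the
    ranges given in the text.
    """
    total = 0.0
    for probability, desirability in outcomes:
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must lie in [0, 1]")
        if not -1.0 <= desirability <= 1.0:
            raise ValueError("desirability must lie in [-1, +1]")
        total += probability * desirability
    return total


# Hypothetical ratings for one performance level (illustrative values only).
high_output = [
    (1.0, 0.5),    # pay: certain to follow, moderately desirable
    (0.5, -1.0),   # fatigue: even chance, very undesirable
    (0.0, 1.0),    # promotion: desirable but unrelated to this level
]
high_output_utility = utility(high_output)
```

In this made-up case the certain, moderately desirable outcome is offset by the likely undesirable one, so the level's utility nets out to zero; an outcome with probability 0 contributes nothing regardless of its desirability.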
Arrow B in Figure 1 illustrates the importance of a person's past experience in determining what his outcome probability
in a given situation. The degree to which a given level of performance in a previously similar situation has been closely followed by certain consequences (Arrow C) influences a person's belief concerning the consequences or outcomes to which that level of performance will presently lead.
[Figure 1 diagram, remaining residue. Labels recoverable here: Level of Performance, Max. Expected Utility; (8) Ability; (6) Performance; (7) Work Outcomes (reward and punishment).]

Cell 5 in Figure 1 (level of performance with maximum expected utility) represents an index of motivation indicating at which of the various possible productivity levels a person would choose to work. For each possible level of performance, we can assess (a) the utility or attractiveness (Cell 2) of that level of performance and (b) the expectancy (Cell 1) for that level of performance. The model of work motivation specifies further that the expectancy for a level of performance should be multiplied by the utility of that level of performance to result in the expected utility for that level of performance. The multiplication of the expectancy term with the utility term illustrates the fact that if either of these terms is zero, the person would not be expected to exert effort (to be motivated) toward reaching a given level of performance. Thus, to be motivated to work at a particular level of performance, a person must not only feel that he can actually achieve that level of performance (high expectancy), but that this level of performance is also an attractive or useful one (high utility).
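
The multiplication of expectancy by utility can be illustrated with a short sketch. The function name, level labels, and ratings below are hypothetical and are chosen only to show that a zero in either term yields zero expected utility.

```python
def expected_utilities(levels):
    """Map each performance level to expectancy x utility.

    `levels` maps a level label to an (expectancy, utility) pair, where
    expectancy lies in [0, 1] and utility is the summed index of
    probability x desirability for that level.
    """
    return {label: expectancy * utility
            for label, (expectancy, utility) in levels.items()}


# Hypothetical ratings for one employee (illustrative values only).
ratings = {
    "low":    (1.0, 0.25),   # sure to reach it, but not very attractive
    "medium": (0.75, 0.5),
    "high":   (0.0, 1.0),    # very attractive, but seen as unattainable
}
result = expected_utilities(ratings)
```

Here the "high" level has the greatest utility, yet its expected utility is zero because the employee sees no chance of reaching it, which is the point the multiplicative form is meant to capture.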

If we assume that people tend to maximize returns from their job, then motivation is expected to be highest for that level of performance which has the highest expected utility in comparison to other possible levels of performance a person might choose to work toward (Arrow E). We further assume that different performance levels are arranged along some dimension of difficulty so that achievement of higher performance levels would require increasing amounts of effort. On the basis of these assumptions, the present motivation model specifies that the higher the performance level for which a given person indicates the maximum expected utility (among the various potential performance levels at which he could work), the higher would be the effort (Cell 5) expended by that

authors is the 
of effort is guided by an individual's goals, by 


io explore some of 


ply indicates that perfor 


effort ex enditure, [ 
Words, the Strength of A person’s Mos 
to Perform at a particular level js most 
his effort, Effort, in turn, 
in that Person’s perfor. 


...? * Person must 
Possess the necessary abilities to 


motivation and ability interact to determine performance. In other words, unless both ability and effort are high, there cannot be good performance. Furthermore, Arrow N indicates that a person's perceptions of his own skills and capabilities may have an influence on his expectancies or his perceived chances of actually being able to achieve various levels of performance. If over a period of time employees change their perceptions about their own relevant skills and abilities, their perceived chances of reaching various levels of performance should also change, all other things being equal.

The second qualification is indicated by Cell 9 and Arrow L. Situational restraints like machine downtime, lack of materials, and other factors not under the control of the individual may prevent the achievement of a certain level of performance even when ability and motivation are high and the person correctly directs his efforts. However, the present model includes a feedback link (Arrow M) between situational restraints (Cell 9) and expectancy (Cell 1). Thus, over a period of time an individual experiencing such situational restraints


expectation of attaining different levels of 
performance, 


CONCEPTUAL AND METHODOLOGICAL ISSUES 
model presented 

above and the research to be reported have 

attempted to deal with some of the problems 


conceptualizations and 
research. 


Interaction of Valence, Instrumentality, and
Expectancy with Organizational Environment


A number of authors have argued for the integration of personalistic and structuralistic views of organizational behavior (Lichtman & Hunt, 1971; Schneider & Bartlett, 1968; Vroom, 1964). Although some theorists (Graen, 1969; Porter & Lawler, 1968) have extended the traditionally ahistorical nature of VIE theory by proposing that the hypothesized cognitive processes are a function of the interaction between characteristics of the personality and the work environment, only a limited research effort has been directed toward understanding this important aspect
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of VIE theory. Studies that 
assessed the impact 
acteristics on the cognitive components of VIE 
theory are few 970; 
Graen, 1969; Schneider & Olson, 
relatively limited, in that only à few 
mental characteristics were 
VIE theory does not include specific state- 
ments about how expectancies and instrumen- 
talities are acquired and how iti 
change in response to experien 
ceptions 
sary to investigate more 
environmental characteristics might in 
with the cognitive processes stipulated by VIE 
theory. The present research explored the 
hypothesis that the relationship between VIE 
i performance may be mod- 


t, it is neces- 
systematically how 


zational constraints (or facilitators), 
as by the length of time that an em 
had to accurately establish what tas 
of the constraints an 
are. 


Definition and Measurement of Valence 


Mitchell (1972) and 
(1972) have discussed the pr 
tion of work outcomes, 
their specificity. Equally 
] definition of valence. 


Vroom 


pated satisfaction. 

:nguish between 
experiented satisfaction antici- 
pated satisfaction (valence). 


variables. 
whether Or 
is a useful one, valence and value 


outcome valence has £ 
e measure of importance. 
no theoretical or empirical basis to
equate the concept of valence or anticipated
satisfaction with the concept of importance.
If importance signifies the existence of some
needs, values, or goals that a person may


have, then the anticipated satisfaction (val- 


present 
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have directly ence), just like experienced satisfaction 

of environmental char- (value), would not only be 4 reflection of 
importance but of some interplay between 


both importance and what a person perceives he would be getting from a given outcome (see Locke, 1969). A number of researchers have already attempted to show that experienced satisfaction and importance are empirically distinguishable (Dachler & Hulin, 1969; Locke, 1969; Mobley & Locke, 1970). It remains to be shown whether or not it is useful to keep separate the concepts of importance and anticipated satisfaction or valence. On the basis of current theory, however, valence and importance seem to be separate terms; and therefore, for purposes of testing VIE theory, operational definitions of valence in terms of outcome importance may not be appropriate.

There are a number of additional problems regarding measures of outcome valence. First, prediction from outcome valence ratings is based upon the assumption that all outcomes are relevant to the person who is rating the work outcomes. To the extent that the list of outcomes is not complete or includes irrelevant items for a particular person or sample


research were of a very 8 
gful to ask a subject to rate the valence or desirability of an outcome in general, since such a procedure assumes that all levels of an outcome are equally attractive. Therefore, the measurement of valence should be undertaken in relation to a specific level of an outcome, not the outcome in general. A final issue concerns the fact that many previous studies have failed to consider negative outcomes (Hackman & Porter, 1968; Hill, Bass, & Rosen, 1970, are notable exceptions). By focusing on positive outcomes or rewards, these studies have tested VIE theory only to a limited extent, since negative outcomes such as fatigue or criticism may also have been important determiners of motivation.

The present study utilized outcomes that were based on previous research as well as extensive interviews with community leaders, management, immediate supervisors, and present and former production employees in the organizations being studied. An attempt was made to add specificity to the outcomes. Negative as well as positive outcomes were included. Finally, valence was measured in terms of outcome desirability (anticipated satisfaction), rather than in terms of importance.


Definition and Measurement of
Instrumentality


appen 
(1970), following the deci. 
Information-processin liter- 


tality as a Perceived 
he Perceived relation. 
linear one, H 
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be restricted to any particular form. The con- 
cept of instrumentality refers only to whether 
or not two outcomes are dependent. There- 
fore, as will be shown later, instrumentality 
should be defined in terms of subjective 
probabilities, 

Mitchell (1972) has argued that whether instrumentality is defined in terms of a perceived correlation or in terms of subjective probabilities depends upon whether one treats behavior and outcome dimensions as continuous or discrete. However, treating outcome dimensions as continuous or discrete variables cannot have a bearing on this question, since we can talk of correlations for both continuous and discrete variables. The issue of continuous or discrete outcome dimensions does, of course, affect the operational definition of the outcomes which subjects use in the instrumentality ratings. If one is dealing with continuous variables, as is typically the case, then arbitrarily restricting those variables to one point (e.g.,
or describing a variable in such general terms 
that subjects 
alues (e.g., works espe- 
» Will not only severely restrict 
information about the nature of instrumental 
i but may also make a valid test 
of VIE theory impossible, 
© summarize, the inconsistencies in the 


nition of instrumentality and the resulting 
Proliferation of 


our understanding of the motivation process
postulated by VIE theory. It is important to
recognize that VIE theory basically presents
a decision-making or choice model of moti-


at other possible 
for behaviors other than 
Production behavior, VIE theory seeks to 
person chooses to engage in one behavior rather than in other alternative behaviors possible within the given circumstances. Research on VIE theory, so far, has not made this choice among alternatives very clear. The question then arises of how to measure instrumentality, when in organizational settings we usually have to work with continuous variables rather than with discrete variables as is typically done in the decision-making area.

In the present study, instrumentality was conceptualized as a series of perceived conditional probabilities. Subjects were required to rate their perceived chances of attaining an outcome for a number of specific levels of performance. This permitted analysis of the manner in which perceived chances of outcome attainment may change over a range of performance levels. Thus, if one plotted the perceived chances of receiving an outcome, given each of a number of alternative levels of performance, instrumentality between performance and that outcome would essentially be defined by the resulting curve. It is important to note that this conception of instrumentality does not assume a linear relationship. Assumption of a linear relationship is implicit in studies that operationalized instrumentality with a question in the general form of “To what extent does increased performance lead to a particular outcome?” The conceptualization used in the present research allows for possible nonlinear relationships between different levels of performance and chances of outcome attainment. This approach provides much more information concerning the concept of instrumentality and, therefore, about the utility of VIE theory.
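
To make the idea of instrumentality as a curve of conditional probabilities concrete, the sketch below stores one subject's perceived chance of an outcome at several performance levels and checks whether the resulting curve happens to be linear. The function names, the performance scale (percent of standard), and the ratings are all hypothetical; the study's actual rating format is not reproduced here.

```python
def instrumentality_curve(ratings):
    """Return (performance level, perceived probability) pairs, sorted by level."""
    return sorted(ratings.items())


def is_linear(curve, tol=1e-9):
    """True if the probability changes by the same amount per unit of performance."""
    slopes = [(p2 - p1) / (x2 - x1)
              for (x1, p1), (x2, p2) in zip(curve, curve[1:])]
    return all(abs(s - slopes[0]) <= tol for s in slopes)


# Hypothetical ratings: perceived chance of a pay raise at each level,
# with performance expressed as percent of standard (illustrative only).
plateau = instrumentality_curve({60: 0.0, 80: 0.25, 100: 0.75, 120: 0.75})
straight = instrumentality_curve({60: 0.0, 80: 0.25, 100: 0.5, 120: 0.75})
```

The `plateau` curve rises steeply between 80 and 100 and then levels off, a shape that a single "does increased performance lead to this outcome?" question could not reveal.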

A consequence of the present conceptuali- 
zation of instrumentality is the fact that 
expectancies are also related to specific levels 
of performance. For reasons already dis- 
cussed, knowing employee perceptions of their 
chances of reaching several specific levels of
performance is preferred to a general
effort-performance statement.

Finally, the attempt to restate VIE theory 
in terms of a decision process among alternative
behaviors (levels of performance) has
clarified another issue of importance to VIE 
theory. Within the present conceptualization, 
the index of motivation (level of performance 
with maximum expected utility) is ipsative in 
nature because a given person’s maximum
expected utility is assigned to a given perfor- 
mance level relative to that person’s expected 
utility for remaining alternative performance
levels. Thus, the present motivation index
represents that level of performance which,
among a number of alternative performance 
levels, has the highest or maximum expected 
utility as perceived by a given person. This 
ipsative approach stands in contrast to the 
traditionally used normative motivation 
measures, which reflect how an individual’s 
expected utility (motivation) for a given per- 
formance level compares to other people’s 
expected utilities or motivation for that per- 
formance level. Within a choice or decision- 
making framework, the important question is 
not whether a subject has indicated a higher 
expected utility for working hard than have 
other subjects in the sample; the important 
question is whether a subject exerts effort 
toward that level of performance which for 
him has the highest expected utility relative 
to his perceived expected utilities of other 
possible levels of performance at which he
could work.
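
The contrast between the ipsative and the normative index can be illustrated with a minimal sketch. It assumes each person's expected utilities for the performance levels are already available as a list; the function name, the example values, and the tie-breaking rule are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def motivation_indices(expected_utilities: Sequence[float]) -> Tuple[int, float]:
    """Return (ipsative index, normative index) for one employee.

    expected_utilities[i] is the employee's expected utility for
    performance level i + 1 (levels are numbered from 1).
    """
    if not expected_utilities:
        raise ValueError("at least one performance level is required")
    # Ipsative index: the level this person rates highest relative to
    # his or her own remaining levels. Ties go to the lowest level,
    # an arbitrary choice made only for this sketch.
    best = max(range(len(expected_utilities)), key=lambda i: expected_utilities[i])
    # Normative index: the absolute expected utility at that level,
    # which only has meaning when compared across people.
    return best + 1, expected_utilities[best]


# Hypothetical ratings for two employees over five performance levels.
employee_a = motivation_indices([1.0, 2.0, 3.0, 2.0, 1.0])
employee_b = motivation_indices([10.0, 30.0, 20.0, 15.0, 5.0])
print(employee_a, employee_b)
```

In this made-up case the second employee reports far larger expected utilities (a higher normative score), yet the level with his maximum expected utility is lower than the first employee's. The ipsative index records only the second kind of information.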


Integration of Task Goal into Valence-
Instrumentality-Expectancy Theory


A final issue to which the present research 
addressed itself is the investigation of the 
determinants of task goals within a VIE- 
theory framework. Although there is a grow- 
ing literature in a number of areas in psy- 
chology concerning the motivational prop- 
erties of goals and intentions (see Ryan, 
1970), in general, there have been relatively 
few studies that have investigated the task 
goal variable in organizational settings. Fur- 
thermore, studies on task goals have concen- 
trated on the regulatory effects of goals on 
task performance, so relatively little is known 
about what factors may affect whether a
person sets goals and what particular task 
goals he may choose for his job. 

The VIE theory provides a suitable frame- 
work within which the process of task goal 
selection can be investigated. There have been 
a few attempts to integrate the concept of 
task goal with the components of VIE theory. 
Campbell et al. (1970) included the term
“task goal” in their “hybrid expectancy
model.” However, it is not clear whether their
use of the term “task goal” was only designed
to clarify the notion of first- and second-level 
outcomes (see Graen, 1969) or whether the 
concept of task goal in their model has motivational
properties over and above those specified
by the VIE-theory components. Arvey
and Dunnette (1970) tested the Campbell et 
al. (1970) model in a laboratory setting and 
included a goal-setting question in their study. 
Their analysis dealt with whether or not a 
subject set a goal, not with what factors deter- 
mine what goal a subject chooses. 

The basic question implied by the present 
VIE model concerning task goals was whether 
they could be predicted by VIE-theory con- 
structs. The present VIE model also allowed 
a comparison between the predictive power of
the basic VIE model and that of stated task 
goals. Other important aspects of goal theory, 
such as the role of goal difficulty or the 
process by which task goals mediate the
effects of incentives or work outcomes (Locke,
1968, 1970; Ryan, 1970), could not be handled
directly within the present framework.
Further theoretical as well as empirical
analyses remain to be done to understand
these factors within the VIE-theory framework.
The integration of the growing literature
on goals and intentional behavior with
VIE theory should provide a better understanding
of the cognitive components of
work motivation.


HYPOTHESES


Construct validation ... assessment of the
network of theoretical relationships (Cronbach
& Meehl, 1955; Dulany, 1968). Based
upon the ... with maximum
expected utility is predicti... performance and stated
... does not reflect the ...

4. The above relationships are stronger for
more experienced employees, since they
should have more accurate perceptions of performance
expectations and performance-outcome
contingencies.


5. The ipsative measure of motivation
(level of performance with maximum expected
utility) is more strongly related to actual
work performance than the normative measure
of motivation (the absolute amount of
expected utility assigned to the level of performance
which has the maximum expected
utility).

6. The correlation between performance
and level of performance with maximum
expected utility is stronger concurrent with
and following data collection than for prior
periods.


7. Ability and level of performance with 
maximum expected utility have an interactive
influence on performance. 


METHOD 
Setting 


The study was conducted in two organizations 
located in the eastern United States. One organization 
(henceforth called Plant 1) had approximately 450 
employees, the majority of whom were females 
working on sewing jobs. The plant was a union 
shop, with most production jobs on individual piece 
rates. The second organization (henceforth called 
Plant 2) had approximately 800 employees, the 
majority of whom were males working on various 
production jobs. Plant 2 was a union shop and had
no incentive program, although production standards
were associated with many jobs.


Subjects 


The employees of interest for the present report
were those who had performance standards associated
with their jobs. The employee samples of both
plants on which this report is based were defined by
the following criteria: ... standards,
having productivity data available, having the
employee’s name on both parts of a questionnaire,
and having both parts of the measure ... .
Completeness was defined by whether or
not the employee responded to 73% of the ...
each of the outcome desirability, expectancy, and
performance-outcome probability measures. For
Plant 1 the present sample included 184 employees,
representing 41% of all employees; for Plant 2 it
included 419 employees, representing 52% of all
employees.

Procedure


The measures used in the present study were incorporated
in a two-part questionnaire. The measures
were the same for both organizations (with the
exception of anchors or definitions for the five levels 
of performance) and were administered in the same 
order. However, the method of administration was 
different in the two plants. 

In Plant 1, the two parts of the questionnaire were
given on two separate days, one week apart. Subjects
were called by department to a large room
where the purpose of the study was explained,
instructions were given, example items were completed,
and their questions were answered. Employees
were then asked to take the measure home, complete
it, and return it to a member of the research team
who would be at the plant on a specified day. The
same procedure was followed for the second part of
the questionnaire. One follow-up letter was sent to
encourage employees to fill out the second part of
the questionnaire.

The procedure for Plant 2 was the same as for 
Plant 1, with two major exceptions. For Plant 2, the 
measures were completed on company time and no 
follow-up was required. 

Participation in both plants was voluntary. Prior
to the actual data collection sessions, a letter from
... explained the purpose of the study ...
community and state ...
understand the nature of people at ...
nature of participation in the
study was stressed both in the letter and at the
beginning of each data collection session. Employees
were requested to put their names on both parts of
the questionnaire so that the two parts could be
matched with criteria data.


Measures 


The questionnaire used to assess ...
specified by the model of ...
developed on the basis of an extensive interview ...
taken in order to ensure that honest and accurate
ratings were received. These steps ...
information can be obtained when the person
filling out the questionnaire is given questions (a)
he can honestly answer, (b) about things he can
observe or experience, (c) in concrete terms, (d) in
language he understands, and (e) on which he is sold
as to the desirability and usefulness of honest and
careful responses. To ensure factual information,
questionnaire items were taken from interview transcripts
and worded in the subject's language. Items
were given specific referents such as “115% of standard”
rather than “good performance”;
the questionnaire was designed in such a way that
the subject would read each item as a complete
sentence while selecting a response alternative, and
a considerable amount of time was spent in training
respondents with examples of each type of question. 
All of the questions in the final questionnaire were 
pretested. 

Levels of performance were presented in terms of
earnings per hour for Plant 1 and percentage of
standard on the individual’s job for Plant 2. The
standards for both plants had been developed over a
number of years through extensive industrial engineering
studies. Five levels of performance were used.
The range of performance levels presented was different
in the two plants due to different distributions of
performance against standards. For Plant 1 the levels
of performance were presented in increments of 40¢
per hour. The increments for Plant 2 were 20% of
the standard.

Expectancy. For each level of performance,
employees were asked to state, on a 5-point, verbally
anchored scale, their perceived probability of being
able to consistently reach that level of performance
without regard to how desirable that level of performance
might be.


Stated goals. Considering all the good and bad
consequences as well as their chances of goal attainment,
employees were asked to indicate their present
work goal and their work goals for the next three
months by selecting one of the five performance
levels.

Direct rating of utility. Employees were required 
to give direct ratings of the desirability or utility of 
the various levels of performance, assuming they 
could reach any level they tried for. 

Work outcome desirability. Employees were asked 
to rate, on a 5-point, bipolar, verbally anchored
scale, the desirability of each of 45 outcomes. The 
response alternatives ranged from very desirable to 
very undesirable, resulting in a 5-point scale scored 
+2 to —2. The outcomes used in the present research 
included items dealing with the subjects of pay, 
supervision, promotion, interpersonal relations, work- 
ing conditions, and work itself. 

Performance-outcome probabilities. This variable 
was assessed by having each employee rate, on a 5- 
point, verbally anchored scale, his perceived chances 
of getting an outcome, given that he was working at
each of the five levels of performance.

Productivity. For Plant 1, productivity ...
for the 6 months preceding ...
the study. The quarterly average reflects the
employee's actual piece-rate earning plus any time the
employee was paid his average rate. Additional pro- 
ductivity measures were the weekly earnings for each 
of 13 weeks during the quarter of this study. The 
weekly earnings reflect the employee’s actual piece- 
rate earning for each week. The primary productivity 
measure of interest for this study was the quarterly 
earnings figure during the period in which the study 
was conducted. 

Plant 2 employees had an assigned or primary job. 
Management was also free to place employees on jobs 
other than their primary jobs. Two efficiency ratings 
were available for Plant 2: the weekly efficiency 
index for each employee on his primary job and the
weekly efficiency index on all jobs he performed,
regardless of whether or not it was his prime job. 
Efficiency was defined by comparing the hours 
required by standards set through industrial engi- 
neering studies (rated hours) against the actual 
hours worked (earned hours). Pe...
for the five weeks preceding the study, the two
weeks of the study, and the four weeks following the
study were used.

... status, education, race, ...
ability, ...
this investigation are presented elsewhere (...
Dachler & Mobley, 1971).


Reliability

... was defined as
working under the same supervisor. For ...
employees under the same supervisor the performance-outcome
probability matrix was
transposed, interrater correlations computed
between all pairs of raters in that environment,
and the correlations averaged following
an r to z transformation. The average interrater
correlation (weighted average over supervisor
after r to z transformation) was .35
for Plant 1 and .20 for Plant 2.
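
The averaging step described above can be sketched as follows. This is a minimal illustration of averaging correlations through Fisher's r to z transformation, not the authors' actual computation: the function name is hypothetical, and because the text does not say which weights were used for the "weighted average over supervisor," the weights here are simply supplied by the caller.

```python
import math
from typing import Optional, Sequence


def mean_correlation(rs: Sequence[float],
                     weights: Optional[Sequence[float]] = None) -> float:
    """Average correlations via Fisher's r to z transformation.

    Each r must lie strictly between -1 and 1; math.atanh raises
    ValueError otherwise.
    """
    if not rs:
        raise ValueError("at least one correlation is required")
    if weights is None:
        weights = [1.0] * len(rs)  # unweighted mean by default
    if len(weights) != len(rs):
        raise ValueError("rs and weights must have the same length")
    zs = [math.atanh(r) for r in rs]  # r to z
    z_bar = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    return math.tanh(z_bar)  # z back to r


# Hypothetical pairwise interrater correlations within one work group.
print(round(mean_correlation([0.10, 0.35, 0.60]), 3))
```

Averaging in the z metric rather than averaging the raw correlations gives more weight to large correlations, whose sampling distribution is skewed; for small correlations the two averages are nearly identical.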

There are several possible reasons for not 
finding higher interrater correlations. Some 
contingencies are individual specific, for 
example, “feeling of pride by working at X
level.” On these items, agreement would be
high only if employees defined “feeling of
pride” or other intrinsic outcomes in the same
manner. Another possibility is that the
operational definition of the “same environment”
was too gross. A more homogeneous
grouping, perhaps including “same job,” may
lead to better rater agreement. However, until 
a better understanding of the determinants of 
instrumentality perceptions is achieved, and 
until further research specifies the general 
determinants and consequences of employee 
agreement in describing their work environ- 
ment (Schneider & Bartlett, 1970), the 
theoretical meaning of “reliability” for instru- 
mentality ratings remains unclear. In any 
case, since the primary motivation variable 
(level of performance with maximum expected 
utility) is ipsative in nature, the agreement 
among employees concerning perceptions of 
actual performance-outcome contingencies is 
less crucial than it would be for normative 
variables.

Criterion reliability was assessed in several 
ways. For Plant 1, the intercorrelation of the
2 quarterly averages was .83. The average
intercorrelation for the 13 weekly earnings
criteria was .88.

The productivity criteria for Plant 2 were
proportions, earned hours/rated hours on
assigned job and earned hours/rated hours
on all jobs worked. The measures were collected
for 11 weeks. The proportions were
subjected to an arc sine transformation
($\phi = 2 \arcsin \sqrt{p}$; Kirk, 1968). The
average intercorrelation of the 11 assigned job
efficiency indices was .56. The average intercorrelation
for efficiency on all jobs was .51.
The ratio of total earned hours to total rated
hours for the 11-week period was also computed
for both assigned job and all jobs
worked. The correlation between these two
measures was .87.
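
As an illustration of the arc sine transformation applied to the efficiency proportions, the sketch below assumes the common variance-stabilizing form, phi = 2 arcsin(sqrt(p)), with the result in radians. The function name is hypothetical, and the sketch assumes each proportion lies between 0 and 1.

```python
import math


def arcsine_transform(p: float) -> float:
    """Return 2 * arcsin(sqrt(p)) in radians for a proportion p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return 2.0 * math.asin(math.sqrt(p))


# Hypothetical weekly efficiency proportions for one employee.
weekly = [0.55, 0.60, 0.72]
print([round(arcsine_transform(p), 3) for p in weekly])
```

The transformed values range from 0 to pi. The transformation stretches the scale near 0 and 1, where the variance of a raw proportion shrinks, so that correlations computed on the transformed scores are less affected by where on the scale an employee's efficiency falls.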
Validity of the Utility Index


According to the VIE model outlined 
earlier, the utility of a given level of perfor- 
mance should be related to the sum of the 
products of the perceived chances that a given
level of performance leads to various work 
outcomes and the desirability of these work 
outcomes. In particular, it was hypothesized 
that direct ratings of utility by employees 
would be highly related to utility as calculated
by the formula specified by the VIE model. 
Further, this relationship should be stronger 
than the relationship between direct ratings 
of utility and expected utility as indexed by 
the product of expectancy and the computed 
index of utility. This is because utility is not 
expected to reflect the expectancy component. 
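
The two computed indices described in this paragraph can be sketched as follows. The sketch assumes that the performance-outcome ratings and the expectancy ratings have been rescaled to probabilities between 0 and 1 (the text does not specify how the 5-point scales were scored) and that desirability is scored +2 to -2. All function names and example values are hypothetical.

```python
from typing import List, Sequence


def utility(outcome_probs: Sequence[float],
            desirabilities: Sequence[float]) -> float:
    """Sum over outcomes of (perceived chance of outcome) x (desirability)."""
    if len(outcome_probs) != len(desirabilities):
        raise ValueError("one probability is needed per outcome")
    return sum(p * d for p, d in zip(outcome_probs, desirabilities))


def expected_utilities(expectancies: Sequence[float],
                       prob_matrix: Sequence[Sequence[float]],
                       desirabilities: Sequence[float]) -> List[float]:
    """Expectancy x computed utility, one value per performance level."""
    if len(expectancies) != len(prob_matrix):
        raise ValueError("one expectancy is needed per performance level")
    return [e * utility(row, desirabilities)
            for e, row in zip(expectancies, prob_matrix)]


# Hypothetical employee: two outcomes, three performance levels.
desirabilities = [2.0, -1.0]                          # e.g. more pay, more fatigue
prob_matrix = [[0.2, 0.1], [0.5, 0.3], [0.9, 0.8]]    # rows are levels 1 to 3
expectancies = [1.0, 0.8, 0.3]                        # chance of reaching each level

utils = [utility(row, desirabilities) for row in prob_matrix]
eus = expected_utilities(expectancies, prob_matrix, desirabilities)
print(utils, eus)
```

In this made-up case utility is highest at level 3, but once the low expectancy of reaching level 3 is taken into account, expected utility is highest at level 2. This is the sense in which the utility index is not expected to reflect the expectancy component.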

These hypotheses were tested by computing
for each subject, across the five levels of performance,
a within-subject correlation among
each employee's computed index of utility,
computed index of expected utility, and direct
ratings of utility. The within-subject correlations
were averaged (following r to z transformation).
The results ... a strong
average correlation between ...
utility and the computed index of utility for
... (r̄ = .92). As predicted, ...
higher than the average correlation ...
ratings of utility and the ...
(r̄ = .73). ...
index of utility and
direct ratings of utility was .68. The relationship
between expected utility and direct
ratings of utility was smaller (r̄ = ...) ...
within-subject correlations ...
vectors. However, ...
observed pattern of within-subject correlations
is necessary, if not sufficient, for support
of the hypothesis that direct ratings of utility
... strongly related to calculated ...
expected utility.

These results indicate that the utility of a
level of performance is related to the extent
to which performance at that level is associated
with desirable outcomes. Further, the utility
of a level of performance is more closely
related to the extent to which that level leads 
to desirable consequences than it is to the 
perceived chances that that performance level 
can in fact be achieved. Both findings support 
the concept of utility as specified by the 
present VIE model.


Expected Utility, Task Goals, and 
Performance


The remaining hypotheses are concerned 
with the motivational determinants of task 
goals and performance. In accord with the 
basic VIE model, it is hypothesized that 
stated task goals are best predicted by the 
level of performance with maximum expected 
utility. Employees' stated task goals, in turn,
will be positively related to their actual production
on the job. Finally, a positive
relationship between level of performance 
with maximum expected utility and actual 
performance on the job is expected. 

A final question concerning the interrelationships
among expected utility, task goals,
and performance addresses itself to the responsiveness
of the VIE components to
experiences in the work environment (Arrows
B, D, M, and N in Figure 1). Employees of
longer tenure should have better estimates of
performance-outcome probabilities, better
estimates of the job demands, and a better
feel for how situational restraints and their
own ability will influence their chances of
reaching various levels of performance. Therefore,
level of performance with maximum
expected utility and task goals should be more
strongly related to performance for long-tenured
employees than for short-tenured
employees. This does not imply
that long-tenured employees will necessarily
be better performers. It only suggests that tenure
should moderate the relationships among
expected utility, task goals, and performance.
Results for Plant 1. For ease of summary 
and discussion, results concerning the 
hypotheses outlined above will be presented 
separately for each plant. The results for 
Plant 1 are presented in Tables 1 and 2. 
Table 1 reports the means and standard devi- 
ations for each of the variables of interest in 
this section. Table 2 presents the correlations

TABLE 1

PLANT 1 MEANS AND STANDARD DEVIATIONS FOR MOTIVATION AND PERFORMANCE
VARIABLES, SUBDIVIDED BY TENURE

                                                            More than 2        2 or less
                                        Total sample        years tenure       years tenure
Variable                                x̄     SD    n       x̄     SD    n      x̄     SD    n

Level of performance with maximum
  expected utility                      3.71  1.10  184     3.78  1.09  132    3.63  1.13  40
Level of performance with maximum
  utility                               4.71   .76  184     4.71   .79  132    4.78   .48  40
Performance quarter of study            4.41   .56  181     4.30   .41  132    3.60   .60  40
Stated current goal                     4.34   .37  176     4.41   .35  132    4.11   .33  40
Stated future goal                      4.47   .34  173     4.51   .33  129    4.35   .34  40

among these variables. These results are pre- 
sented separately for the total sample, for 
employees with more than two years of 
tenure, and for employees with two years or 
less of company tenure. 

As hypothesized, level of performance with 
maximum expected utility was significantly 
related to stated goals. Furthermore, level of 
performance with maximum expected utility 
was a better predictor of stated goals than was 
level of performance with maximum utility. 
This result lends support to the concept of 
expected utility by showing that the expec- 
tancy term is an important contributor to the 


prediction of task goal. Thus, the performance 
level a person indicates as his goal depends on 
how attractive or desirable that performance 
level is (utility) as well as on how likely he 
feels he is to actually be able to work at that 
level of performance (expectancy). 

The relationships discussed above are, in 
general, higher for the stated future goals 
than for stated current goals. If the assump- 
tion is correct that people continuously adjust 
their goals, expectations, and performance- 
outcome probabilities on the basis of expe- 
rience, one would expect that current beliefs 
about the consequences of different kinds of 


TABLE 2

PLANT 1 CORRELATIONS AMONG EXPECTED UTILITY, TASK GOALS,
AND PERFORMANCE, SUBDIVIDED BY TENURE

Column groups: Current goal; Future goal; Performance quarter of study.
Within each group: Total; More than 2 years tenure; 2 or less years tenure.

                                   Current goal           Future goal            Performance quarter of study
Variable                           Total  >2 yr  2 or <   Total  >2 yr  2 or <   Total  >2 yr  2 or <

Level of performance with
  maximum expected utility         .27*   .30*   ...      .35*   .27**  ...      .30**  .38*   .09
                                   (176)  (132)  (40)     (173)  (129)  (40)     (181)  (132)  (40)
Level of performance with
  maximum utility                  .13    .21*   .23      .20*   .23*   .49      .04    .09    .01
                                   (176)  (132)  (40)     (173)  (129)  (40)     (181)  ...    ...
Current goal                                              .46**  ...    ...      ...    ...    ...
                                                          (173)  ...    ...      ...    ...    ...
Future goal                                                                      ...    ...    ...
                                                                                 (170)  (129)  (40)

Note. Ns are in parentheses.
* p < .05.
** p < .01.

production behaviors and about the chances
of reaching different levels of performance 
would be reflected in the goals that people try 
to work for in the near future. As becomes 
evident from the correlations for long- and 
short-tenured employees, there exists a strong 
relationship between level of performance 
with maximum expected utility and stated 
future goals, particularly for short-tenured 
employees. 

The relationships among level of perfor- 
mance with maximum expected utility, level 
of performance with maximum utility, and 
stated goals were moderated by employee 
tenure. The correlation between level of performance
with maximum expected utility and
... tenured employees than for short-tenured
employees. However, the ...
level of performance with maximum expected
utility and future goals was stronger for short-tenured
employees than for long-tenured
employees. The moderation effects of employee
tenure on the relationship between
level of performance with maximum utility
and stated goals show a similar pattern, more
pronounced for stated future goal than for
stated current goal. The findings that short-tenured
employees' ...
strongly related to both utility ...
utility (but they were not related to performance)
may indicate that, perhaps, employees
... the company for a long time
are much more mature when thinking about
future goals, taking i...
but not specifically known, chang...
occur in an organization. On the other hand,
new employees, not being aware of the extent
of change that can occur, may be more likely
to base their future goal only on the present
assessment of the situation. An additional possibility
is that the scale for stated future goal
did not go high enough, thus imposing an artificial
ceiling and reducing the correlations for
older employees.

Turning to the motivational determinants 
of performance, Table 2 shows that, in accord 
with the current model of motivation, level of
performance with maximum expected utility is
significantly related to performance. Furthermore,
level of performance with maximum
utility did not correlate as highly with production
behavior as did level of performance
with maximum expected utility. Again, these 
results support the present VIE model, sug- 
gesting that performance is more predictable 
when both utility and expectancy are taken 
into consideration. Finally, in agreement with 
the assumption that experience affects the 
cognitions specified by the VIE model, the 
level of performance with maximum expected 
utility was related to production behavior for 
long-tenured employees but not for short- 
tenured employees. 

Earlier it was argued that level of perfor- 
mance with maximum expected utility is 
ipsative in nature. Since level of performance 
with maximum expected utility is assessed 
relative to each employee's expected utility 
indices for the remaining alternative perfor- 
mance levels, while the absolute amount of 
expected utility is measured relative to other 
employees' amounts of expected utility for 
each performance level, the former measure of 
motivation should be a better predictor of 
performance. The correlation between the
absolute amount of expected utility of the
level of performance with maximum expected
utility and performance was .14 (ns) compared
to a correlation of .30 (p < .01) between
the level of performance with maximum
expected utility and performance. 

Finally, we turn to the relationship between 
stated goals and performance. Both stated 
current and stated future goals were signifi- 
cantly related to performance. Again, em- 
ployee tenure moderated the relationship be- 
tween task goals and productivity. There was 
no relationship between stated goals and per- 
formance for new employees, while there was 
a significant relationship between stated goals 
and productivity for long-tenured employees. 

Another way of looking at the effects of 
goals on productivity is to investigate the 
relationships between employees’ stated goals 
and productivity measures at different points 
in time. Since task goals are viewed as indi- 
cating what a person intends to do with 
regard to performance (based on his moti- 
vation), one would expect that this expressed 
intention is related to current and future productivity
but not necessarily to past production
behavior. Results reflecting this argument
are presented in Figure 2. This figure shows 


FIG. 2. Correlations between stated goals and performance
for each of 13 weeks. (Plotted by week for stated current goal
and stated future goal; data collection in Weeks 10-11.)


the plot of the correlations between weekly 
earnings for each of 13 weeks and stated current
and future goals. It is evident that stated
current goals had a higher correlation with
each of the weekly production criteria than
did future goals. Of special note is the fact
that the correlation between stated current
goals and performance was clearly highest for
the second week of data collection and the
following 2 weeks, that is, the week that
stated goals were measured (r = .53) and the
following 2 weeks. The pattern for stated
future goals was not as clear. However, the
correlations for the week of and the weeks
following data collection were among the highest
correlations. Thus, what the employees
... to be is indicative of
... produce on the job.

... made with regard
... it was hypothesized that
... with maximum expected
... strongly related to per...
following the study than
... (quarterly average,
... production behavior
... 6 months prior to the
study (quarterly ...
months ... difference
... correlations is statistically
significant (p < .01). When correlations between
level of performance with maximum
expected utility and performance were computed
for each of 13 weeks surrounding the
study, no clear pattern emerged, although the
correlation for the week of data collection 
turned out to be the highest (r — 8,9 < 
.05). Thus the VIE cognitions, although 
obviously not independent of past behavior, 
do seem to be best reflected in current and 
future behavior. 

Finally, the VIE model presented in this 
paper hypothesizes an interaction between 
motivation and ability. Unfortunately, an 
adequate test of this hypothesis was not pos- 
sible since ability test scores in Plant 1 were 
available only for short-term employees. This,
of course, was the group for whom the 
expected utility index was hypothesized and 
shown to be an ineffective predictor of perfor- 
mance. For the 46 new employees for whom 
ability tests were available in Plant 1, the
correlation between ability and the quarterly 
average performance for the period of the 
study was .36 (p < .05). The correlation be- 
tween level of performance with maximum 
expected utility and quarterly average perfor- 
mance was .06 (ns) for the same group of 
employees. None of the correlations using the 
interaction term between motivation and 
ability improved the predictability of performance
over the predictability of performance
by the ability measures alone.

Results for Plant 2. The Plant 2 means 
and standard deviations for the variables of 
interest in this section are presented in Table 
3. Table 4 reports the correlations among 
these variables for the total sample and for 
long- and short-tenured employees, separately.
In order to get a more extreme separation in 
tenure, employees in Plant 2 were subdivided 
at one year of tenure, rather than two years of 
tenure as was done for Plant 1 employees. 
Subsequent analyses revealed few substantial 
differences when the Plant 2 sample was sub- 
divided at two years of tenure or when 
employees with five years or more tenure were 
compared with employees who had been in the 
company for only one year or less. 

Contrary to the results of Plant 1, the predictions
based on the work motivation model
are not clearly supported by the data collected
in Plant 2. First, stated current and

TABLE 3

PLANT 2 MEANS AND STANDARD DEVIATIONS FOR MOTIVATION AND PERFORMANCE
VARIABLES, SUBDIVIDED BY TENURE

                                                            More than 1 year   1 year or less
                                        Total sample        tenure             tenure
Variable                                x̄     SD    n       x̄     SD    n      x̄     SD    n

Level of performance with maximum
  expected utility                      3.48  1.23  412     3.51  1.21  246    3.48  1.31   96
Level of performance with maximum
  utility                               4.18  1.24  412     4.17  1.24  246    4.32  1.20   96
Performance, all jobs                   1.79   .45  412     1.82   .45  276    1.77   .41  109
Performance, prime job                  1.86   .51  393     1.86   .53  276    1.85   .46  109
Stated current goal                     2.19   .47  366     2.23   .15  246    2.12   ...   96
Stated future goal                      2.19   .47  367     2.23   .16  246    2.19   .22   96

future goals were equally related to either
level of performance with maximum expected
utility or level of performance with maximum
utility. This result does not support the notion
that a person's performance goal is based on
both how attractive that performance level is
(utility) and how likely it is that a person
can actually achieve that level of performance
(expectancy). This is particularly true in view
of the fact that there was a fairly strong
correlation (r = .44, p < .01) between level
of performance with maximum expected
utility and level of performance with maximum
utility, indicating that for employees in
Plant 2 there may not have been a great deal
of difference between utility and expected
utility. Furthermore, tenure did not moderate
the relationships among level of performance
with maximum expected utility, level of performance
with maximum utility, and stated
goals.

Contrary to the results presented for Plant
1, all correlations involving production behavior
were exceedingly low, although statistically
significant in some instances. A
substantial contribution of level of performance
with maximum expected utility to performance
was not demonstrated; nor did level
of performance with maximum expected
utility predict productivity any better than


TABLE 4


PLANT 2 CORRELATIONS AMONG EXPECTED UTILITY, TASK GOALS,
AND PERFORMANCE, SUBDIVIDED BY TENURE


Variable | Current goal | Future goal | Performance: All jobs | Performance: Prime job
(each column group reports Total, More than 1 year tenure, and 1 year or less tenure)


Level of performance with 
maximum expected 
utility 29+ | .30**| .32** 32% | .20**| .38** aA2* EU 20* 


(366) | (246) | 99 | (367 9 | 09 |. 
Level of performance with 2 Ge [049 (96) (12) | (276) | (109). | (393) (276) (109) 


maximum utility ied OF e .30** | .28* .3T* AME 10 22* 


3 E 38 
(6e) | Gas) | G9 | Ger | 249 | 69 (412) | @76) | (109) 9) care) | do» 


Current goal ae ask dis " m 1E 
366) | Q46 ( 3 À ( 
Future goal Bey, eaS | Co. SON (246) (96 


(367) | ae | (99 | G50 (346) | (99 


Note. Ns are in parentheses.
* p < .05.
** p < .01.
did level of performance with maximum
utility, as the VIE model requires.

The hypothesis that the ipsative index of 
motivation should be a better predictor of 
performance than the normative index was 
likewise not supported with the data gathered 
in Plant 2. The correlation between the abso-
lute amount of expected utility and perfor-
mance was .18 (p < .01) compared to a
somewhat smaller correlation of .12 (p <
.05) between level of performance with maxi-
mum expected utility and performance.
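
As a rough check on the significance levels attached to the two correlations just discussed (read here as .18 and .12), the Fisher z transformation gives an approximate two-tailed p value for a single correlation. The sketch below is illustrative only. The function name `correlation_p_value` is hypothetical, the sample size of 412 is an assumption borrowed from the total-sample n for performance in Table 3 (the n behind each correlation may be smaller), and the normal approximation is not necessarily the test the authors applied.

```python
import math


def correlation_p_value(r: float, n: int) -> float:
    """Approximate two-tailed p value for H0: rho = 0 using Fisher's z."""
    # atanh(r) is Fisher's z; its standard error is 1 / sqrt(n - 3).
    z = math.atanh(r) * math.sqrt(n - 3)
    # Two-tailed tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


N_ASSUMED = 412  # assumption: total-sample n for performance (Table 3)

p_absolute = correlation_p_value(0.18, N_ASSUMED)  # absolute amount of expected utility
p_level = correlation_p_value(0.12, N_ASSUMED)     # level with maximum expected utility

print(f"r = .18: p = {p_absolute:.4f}")
print(f"r = .12: p = {p_level:.4f}")
```

Under these assumptions the larger correlation clears the .01 level and the smaller one clears only the .05 level, which matches the pattern of significance levels reported in the text.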

Again, tenure did not seem to moderate
the relationship between expected utility and
production behavior. In fact, contrary to
expectations, the correlations are somewhat
higher for the short-tenured employees than
for the long-tenured employees. These correla-
tions, however, were so small that this differ-
ence is probably not a very meaningful one.
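
One conventional way to judge whether correlations from two separate tenure groups differ is Fisher's z test for independent correlations. The following is a minimal sketch and not the authors' analysis. The function name `independent_r_difference` is hypothetical, and the inputs (r = .22 with n = 109 for short-tenured employees, r = .10 with n = 276 for long-tenured employees) are illustrative assumptions chosen only to resemble the small correlations described above.

```python
import math


def independent_r_difference(r1: float, n1: int, r2: float, n2: int) -> tuple[float, float]:
    """Return (z, two-tailed p) for H0: rho1 == rho2 in two independent samples."""
    # Difference between the Fisher-transformed correlations.
    diff = math.atanh(r1) - math.atanh(r2)
    # Standard error of that difference.
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Illustrative values only (assumed, not read from the study's tables).
z_stat, p_value = independent_r_difference(0.22, 109, 0.10, 276)
print(f"z = {z_stat:.2f}, p = {p_value:.3f}")
```

With inputs of this size the difference falls well short of conventional significance, in line with the caution expressed above about interpreting the tenure difference.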

The relationships between stated goals and 
actual productivity of Plant 2 employees were 
substantially smaller than those found for 
Plant 1 employees, although they were statis- 
tically significant in most instances. Further- 
more, the two tenure groups differed little in 
the relationship between stated goals and 
actual performance. Contrary to the results 
obtained for Plant 1 employees, the length of 
time a person had been employed by Plant 2 
had little effect on the extent to which actual 
productivity could be predicted by stated 
goals. 

Analyses of the temporal patterns of the
correlations between stated goals and perfor-
mance and between level of performance with
maximum expected utility and performance
[illegible] trend. In contrast to results
[illegible] Plant 1 employees, the correlations
[illegible] and actual production
[illegible] the week [illegible] during and follow-
ing [illegible] collection as was pre-
dicted. The same finding was obtained for the
correlations between level of performance with
maximum expected utility and production.

As was the case for Plant 1, it [illegible] pos-
sible to test the hypothesized Ability × Moti-
vation interaction in Plant 2. [illegible]
ability measures that were [illegible]
subsample of Plant 2 employees [illegible]
essentially zero validities. Furthermore, the
motivation index used in the present study did
not predict performance very well for Plant 2
employees. It is therefore not surprising that
no support for the interaction hypothesis was
found in Plant 2.


Some Comparative Analyses for the 
Two Plants 


In order to investigate possible differences 
in the extent to which Plant 1 and Plant 2 
employees perceived contingencies between 
performance and the various outcomes, eta 
coefficients for the regression of perceived out- 
come probability on levels of performance 
were computed for each outcome. For 36 of 
the 45 work outcomes, the coefficient was 
higher for Plant 1 than for Plant 2. Thus, 
Plant 1 employees in most cases perceived a
stronger performance-outcome contingency
than did Plant 2 employees.

A second comparative analysis for the two
organizations involved the ranking of the 45
outcomes on mean outcome desirability in
order to see whether the two employee
samples differed in the pattern of desirable
and undesirable outcomes. The two plants
showed very similar patterns of outcome
preferences. Monetary outcomes, accomplish-
ment, pride, getting the most enjoyable job,
and respect from the supervisor were among
the most desirable outcomes. Boredom, ten-
sion, being moved to a disliked job, and not
having the necessary parts, tools, and
materials were among the least desirable out-
comes.
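
The eta coefficient (correlation ratio) used in the first comparative analysis above expresses how much of the variance in perceived outcome probability is accounted for by performance level, without assuming that the relationship is linear. A minimal sketch follows. The function name `eta_coefficient` and both sets of ratings are hypothetical and are not data from either plant.

```python
import math
from collections import defaultdict


def eta_coefficient(levels: list[int], ratings: list[float]) -> float:
    """Correlation ratio: sqrt(between-level sum of squares / total sum of squares)."""
    groups: dict[int, list[float]] = defaultdict(list)
    for level, rating in zip(levels, ratings):
        groups[level].append(rating)

    grand_mean = sum(ratings) / len(ratings)
    ss_total = sum((x - grand_mean) ** 2 for x in ratings)
    # Each level contributes its size times the squared deviation of its mean.
    ss_between = sum(
        len(values) * (sum(values) / len(values) - grand_mean) ** 2
        for values in groups.values()
    )
    return math.sqrt(ss_between / ss_total)


# Hypothetical perceived probabilities of one outcome at five performance levels.
levels = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
steep = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]  # clear contingency
flat = [0.5, 0.6, 0.4, 0.6, 0.5, 0.5, 0.6, 0.4, 0.5, 0.6]   # weak contingency

eta_steep = eta_coefficient(levels, steep)
eta_flat = eta_coefficient(levels, flat)
print(round(eta_steep, 3), round(eta_flat, 3))
```

A higher eta for one plant on a given outcome corresponds to what the text calls a stronger perceived performance-outcome contingency.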


Finally, mean utility, mean expectancy, and 
mean expected utility as a function of level of 
performance were compared across the two 
organizations. The slope of the utility func- 
tion for Plant 1 was much steeper than the 
slope of the line for Plant 2. Plant 1 
employees seem to have perceived a much 
greater increase in utility from the lowest 
level to the highest level of performance than 
did Plant 2 employees. If the difference in
slope of the performance-utility function is
not a result of differently perceived perfor-
mance level intervals for the two plants, it
can be argued that differences in perceived
consequences of working at different levels of
performance were much more distinct for
Plant 1 employees than for Plant 2 employees.


It should be noted that in choosing the per-
formance level intervals, every effort was
made to have equivalent performance inter- 
vals for the two plants. Since Plant 1 also had 
performance standards associated with most 
of the jobs, performance levels were initially 
expressed in terms of the same percentage of 
standard intervals as were used in Plant 2. 
Only then were the Plant 1 performance levels 
translated into earning equivalents based 
upon the incentive rate system, because Plant 
1 employees did not seem used to thinking 
about performance in terms of percentage of 
a standard. We are aware that this procedure 
did not necessarily guarantee equal perfor- 
mance intervals for the two plants. 

The expectancy-performance level function 
showed a very similar slope for both organi- 
zations. Although Plant 1 employees seemed 
to have had a slightly higher expectancy of 
reaching the two lowest levels of performance, 
employees in both plants agreed fairly well in 
their perceptions of decreasing chances of 
reaching increasing levels of performance. 

Figure 3 presents the plant comparison of 
mean expected utility as a function of level of 
performance. For both plants, mean expected 
utility increased linearly up to the third level 
of performance, exhibited a smaller increase 
to the fourth, and then decreased to the fifth 
level of performance. The terminal points 
and the rate of change over performance levels 
were less extreme for Plant 2. Again, assuming 
that the performance level intervals were per- 
ceived to be equal by the two employee 
samples, on the average, there seemed to be 
greater motivational differences for different 
levels of performance for Plant 1 employees 
than for Plant 2 employees. It made more of
a difference for Plant 1 employees whether 
they performed at high or low levels of per- 
formance than was the case for Plant 2 
employees. 


FIG. 3. Mean expected utility as a function of level
of performance.


DISCUSSION


The Plant 1 results were supportive of the
VIE model both in terms of statistical signifi-
cance and in terms of the pattern of relation-
ships. Stated goals were shown to be related
to actual performance, thus supporting the
contention of Locke (1968) and Ryan (1958,
1970) that goals and intentions regulate the
expenditure of effort and thus are a primary
motivational concept. Further, it was shown 
that stated goals are significantly related to 
level of performance with maximum expected 
utility. This indicates that expectancy, perfor- 
mance-outcome probabilities, and outcome 
desirability are variables relevant to the ques- 
tion of goal selection. The stated goal corre- 
lations might have been even higher had it 
not been for restriction of variance in stated 
goals. Few employees selected the extreme 
goals. 
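
The way these variables could combine into a predicted goal can be sketched as follows. The sketch assumes a multiplicative rule in which the utility of a performance level is the sum of outcome desirabilities weighted by perceived performance-outcome probabilities, and expected utility is that utility multiplied by the expectancy of reaching the level. This combination rule, the function names, and all numbers are assumptions made for illustration and are not taken from the study's data.

```python
def utility(outcome_probs: list[float], desirabilities: list[float]) -> float:
    """Utility of one performance level: sum of probability x desirability."""
    return sum(p * d for p, d in zip(outcome_probs, desirabilities))


def level_with_max_expected_utility(
    expectancies: list[float],
    outcome_probs_by_level: list[list[float]],
    desirabilities: list[float],
) -> tuple[int, list[float]]:
    """Return the 1-based level with the highest expected utility, plus all values."""
    expected = [
        e * utility(probs, desirabilities)
        for e, probs in zip(expectancies, outcome_probs_by_level)
    ]
    best = max(range(len(expected)), key=expected.__getitem__) + 1
    return best, expected


# Hypothetical employee: two outcomes (pay, tension) rated on a -3..+3 scale.
desirabilities = [3.0, -2.0]
outcome_probs_by_level = [
    [0.1, 0.1], [0.3, 0.1], [0.5, 0.2], [0.7, 0.3], [0.9, 0.5],
]
expectancies = [1.0, 0.9, 0.8, 0.6, 0.3]  # perceived chance of reaching each level

best_level, expected_utilities = level_with_max_expected_utility(
    expectancies, outcome_probs_by_level, desirabilities
)
print(best_level, [round(v, 2) for v in expected_utilities])
```

In this made-up case utility keeps rising with performance, but falling expectancy pulls the expected-utility maximum down to the fourth level, so the level with maximum utility and the level with maximum expected utility differ.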

The interrelationships prescribed by the 
model were supported by the findings that 
level of performance with maximum expected 
utility was significantly related to perfor-
mance, was more strongly related to perfor-
mance than was level of performance with
maximum utility, was more strongly related 
to performance than was the absolute amount 
of expected utility for the level of perfor- 
mance with maximum expected utility, and 
was more strongly related to performance 
during the period of the study than six months 
prior to the study. 

The fact that tenure was an important 
moderator of the Plant 1 results adds to the 
validation of the model. If expectations and 
performance-outcome contingencies are re- 
sponsive to experience, then it follows that 
longer tenured employees should have more 
accurate perceptions of their chances of
reaching various levels of performance and of 
performance-outcome contingencies. 

The Plant 2 results generally failed to sup- 
port the predictions of the model. Although 
stated goals were related to level of perfor- 
mance with maximum expected utility, they 
were not strongly related to performance nor 
was level of performance with maximum 
expected utility strongly related to perfor- 
mance. Furthermore, employee tenure did not 
moderate the relationships among expected 
utility, task goals, and performance. 

The differences in results for the two plants 
in themselves represent an important finding. 
It might be argued that differences in the 
characteristics of the two samples could ac- 
count for the differences in the results for 
Plant 1 and Plant 2. A comparison of the 
characteristics of Plant 1 employees with the 
characteristics of Plant 2 employees shows 
that the largest differences are on company 
tenure, job tenure, and sex. The fact that the 
employee sample of Plant 1 was nearly all 
female whereas the Plant 2 employee sample 
was mainly male does not seem to have a 
direct bearing on the predictions derived from
the present model of motivation. There is no
reason to expect males and females to show
significantly different patterns of cognitions.
They obviously may differ in the specific
content but not in the general pattern of
cognitions postulated by VIE theory. On the
other hand, the fact that Plant 1 employees
had an average company tenure of 11 years,
compared to only 5 years of company tenure
for the average Plant 2 employee, may have
prevented Plant 2 employees from basing
their performance goals on a very accurate
assessment of the organizational and personal
conditions necessary to perform at those goals.
Therefore, their stated goals and their
expected utility may have been less likely to
be reflected in their actual production
behavior. However, even when Plant 2
employees were divided into those with 5
years or more tenure and those with 1 year
or less tenure, the expected differences in the
relationships among expected utility, task
goal, and performance were not obtained. The
absence of tenure effects on the predictability
of employee performance in Plant 2 would
indicate that some factors other than
employee tenure may have prevented the 
cognitions of even long-tenured employees in 
Plant 2 from being predictive of their choice 
of performance goals and exertion of effort 
toward some level of performance. 

Reflection on the nature of such limiting 
factors or boundary conditions (Graen, 1969) 
for VIE theory might result in at least two 
sets of conditions. One set includes such 
factors as lack of specified organizational 
contingencies between alternative behaviors 
and ensuing consequences and organizational 
structures and climates which induce per- 
ceived feelings of “helplessness” (or the in- 
ability to engage in desired behaviors or 
desired levels of performance). Some organi- 
zations may have certain characteristics (e.g., 
frequent changes in policies, lack of efficient 
channels of communication, frequent changes 
in job assignments, working for more than 
one supervisor) which either hinder the estab- 
lishment of clear contingencies or give 
employees the feeling that their work be- 
haviors are impaired by all sorts of con- 
straints. The existence of such conditions may 
result in employees accurately perceiving that 
different behaviors or performance goals are 
not instrumental in attaining desired work 
outcomes, or that there is no difference in the 
utility of various alternative performance 
levels, making the choice of performance goals 
an arbitrary one. Alternatively, such boun- 
dary conditions may lead employees to accu- 
rately perceive low probabilities of achieving 
high-utility performance levels. 

The second set of factors refers to personal 
or organizational conditions which make it 
difficult for employees to accurately perceive 
what the actual situation is like, what their 
chances of reaching certain levels of perfor- 
mance are, what the consequences of various 
alternative behaviors may be, and whether 
consequences, in fact, differ for different alter- 
native behaviors. While performance-out- 
come contingencies may in fact exist and 
achievement of various performance levels 
may not be appreciably obstructed by
environmental factors, there may nevertheless
exist conditions, such as employees’ lack of
experience, personal characteristics related to
unrealistic aspirations, inefficient information
flow in the organization, inadequate or lack of
training, etc., which prevent employees from 
forming accurate perceptions about what the 
situation is actually like. 

Although the present results are obviously 
not sufficient to establish the existence and 
characteristics of such boundary conditions, 
they are consistent with such an explanation. 

The fact that the pattern of results 
obtained from Plant 1 employees supported 
every prediction derived from the present VIE 
model, whereas the pattern of results from 
Plant 2 employees in nearly all cases did not 
bear out these predictions, provides a strong 
argument for the possibility that certain 
organizational characteristics in Plant 2 acted 
as boundary conditions for the present VIE 
model. While the present study did not obtain 
precise measures of organizational conditions 
in the two participating companies, it may be 
possible to infer some characteristics of these 
boundary conditions from the more directly 
observable differences between the two plants. 

Since the piece-rate system used in Plant 1 
specified the contingencies between levels of 
performance and obtained salaries, the per- 
ceived relationship between different levels of 
performance and money-related outcomes 
should be more clearly defined for employees 
in Plant 1 than for employees in Plant 2, who 
were not exposed to any incentive plan. This 
was in fact the case. When the average 
chances of obtaining an outcome for different 
levels of performance were plotted across 
Plant 1 and Plant 2 employees for all the 
money outcomes (enough money to buy 
luxuries, less money than I deserve, enough 
money to buy essentials, enough money to 
provide for future expenses such as education 
of my children, retirement, etc.), for each 
money item the curve for Plant 1 employees 
was steeper than the curve for Plant 2 
employees, indicating a more pronounced per- 
ceived contingency between performance and 
money outcomes for Plant 1 employees (see 
Dachler & Mobley, 1971). 

There were a number of additional factors
which may have contributed to the fact that
perceptions of Plant 1 employees were more
predictive of their production behavior. In
[illegible], the jobs in Plant 1 seemed to be more
structured, and employees were less likely to
be moved from one job to another. Further-
more, Plant 1 had fewer organizational levels
than Plant 2 and was smaller in terms of 
number of employees as well as number of 
different departments and product categories. 
These gross organizational differences, as 
observed by the present researchers, all favor 
the conclusion that the organizational condi-
tions of Plant 1 were more structured, less
complex, and more “assessable” so that it was 
easier for employees to form accurate percep- 
tions about the organizational conditions and 
contingencies as well as to accurately perceive 
how much they could actually achieve in the 
existing situation. In addition, the fact that 
most employees in Plant 1 had gone through 
approximately one week of training before 
being put on the job, whereas Plant 2 
employees usually did not receive more than 
the customary orientation training before 
starting on the job, may well have enhanced 
the accuracy of Plant 1 employee perceptions 
and hindered the accuracy and realism of 
Plant 2 employee perceptions. These interpre- 
tations are consistent with the finding that 
36 out of 45 performance-outcome contin- 
gencies were stronger in Plant 1 and the find- 
ing that the mean expected utility function 
was stronger for Plant 1. 
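
The 36-of-45 tally can be put in perspective with a simple sign test, which asks how often a split at least this lopsided would arise if each outcome were equally likely to favor either plant. This is a supplementary sketch, not an analysis reported by the authors. The function name `sign_test_upper_tail` is hypothetical, and the test treats the 45 outcomes as independent, an assumption that ratings collected from the same respondents may not satisfy.

```python
from math import comb


def sign_test_upper_tail(successes: int, trials: int) -> float:
    """P(X >= successes) for X ~ Binomial(trials, 0.5)."""
    favorable = sum(comb(trials, k) for k in range(successes, trials + 1))
    return favorable / 2**trials


# 36 of the 45 outcomes showed a higher eta coefficient in Plant 1.
p_one_sided = sign_test_upper_tail(36, 45)
print(f"one-sided p = {p_one_sided:.6f}")
```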

Although the work outcomes presented to 
employees in this study were taken from inter- 
views conducted in both plants, it should be 
noted that the list of outcomes was by no 
means a complete one. Furthermore, it may 
well be possible that people take only a few 
important outcomes into consideration in 
making their task goal and performance level 
decisions. If rewards associated with working 
for Plant 2 were not the same as in Plant 1 
(even though the employees in the two 
samples seemed to have very similar patterns 
of desirable and undesirable outcomes) or if 
Plant 2 employees were more likely to focus, 
for example, on non-work-related outcomes
(e.g., going fishing, having a good time with
friends, etc.), which were not included in the
productivity part of this study, then the low
relationships found in Plant 2 might also have
been due to the fact that the outcomes used
were not as relevant to Plant 2 employees as
they were to Plant 1 employees.

It is important to note that the explanations
given above with regard to the boundary con- 
ditions are not meant to imply that Plant 2 
employees did not produce or were not moti- 
vated. If indeed they are correct, these expla- 
nations imply that if employees have beliefs 
that do not accurately reflect the current 
situation, their behaviors may also not 
correspond to what the situation allows or 
requires. In addition, the flatter the expected
utility function is across levels of perfor- 
mance, the more arbitrary may be the choice 
of performance goals and the less effort and 
actual performance can be predicted by VIE- 
theory constructs. In the extreme, this may 
indicate that people who are placed into a 
completely random situation, in which no 
contingencies exist and in which little of any- 
thing is predictable, may behave in essentially 
random or trial and error fashion. 

Although the boundary condition interpre- 
tation appears to be consistent with the pres- 
ent results, the sex differences between the 
two samples and the procedural differences in 
collection of data could provide alternative 
explanations. Furthermore, the reader may
recall that the five levels of performance, on
the basis of which employees gave the various
ratings, were presented in terms of money
earned per hour for Plant 1 employees and in
terms of percentage of a production standard
for Plant 2 employees. This difference in the
definition of performance levels may also
account for the differences in the results for
the two plants. Although these definitions of
performance were carefully worked out and
checked before the actual collection of data,
performance in terms of money earned per
hour may have been much more meaningful
and understandable than performance in
terms of a company-set production standard
[text missing in source]
functioning of certain employee thought
processes, it is clear that the observed rela- 
tionships were far from perfect. Even under 
the best of circumstances, production behavior 
was only moderately predictable by the 
cognitive variables measured in this study. 
The present model of work motivation is still 
at an early stage of development and will 
require considerable modification before it 
can provide a more complete understanding of 
organizational behavior. Furthermore, the 
measures used to assess employee thought 
processes are still lacking in precision and are 
therefore in need of further refinement. 
Nevertheless, with the help of both the model 
of work motivation and the measures designed 
to assess the thought processes specified by 
the model, systematic and orderly results were 
observed, adding to the understanding of be- 
havior in organizations. 

The concept of boundary conditions, which 
provides an excellent post hoc explanation of 
the remarkable differences in the results from 
the two participating organizations, has some 
important theoretical as well as practical 
implications. Graen (1969) used the concept 
of boundary conditions for VIE theory as a 
means to explain the fact that he found sup- 
port for his VIE models only in those circum- 
stances in which manipulated contingencies 
between performance and certain work out-
comes had been clearly established. Since he
also found that the manipulation of contin- 
gencies was reflected in perceived instrumen- 
talities, he was able to argue that unless the 
organizational environment specifies contin- 
gencies between performance and valued or 
disvalued consequences and unless these con-
sequences are clearly perceived by the mem-
bers of the organization, VIE theory cannot 
make predictions about performance. This 
argument is clarified if one puts VIE theory 
into a decision-making framework, as was 
done in the present research. If choice of task 
goals and effort expenditure toward achieve- 
ment of a specific level of performance is 
made, in part, on the basis of comparing the 
expected utility of the possible alternative 
levels of performance, as the results from 
Plant 1 would suggest, then a “rational” 
choice of a level of performance is difficult to 
make when no clear maximum expected utility 
is perceived for any level of performance. In
other words, if the perceived consequences and 
their likelihood of occurrence are similar 
across different performance levels or are 
indeterminate, then the assumption of max-
imization of valued outcomes provides little
basis upon which to make a decision. Pre- 
sumably some choice is made, since employees 
in Plant 2 did perform. However, the decision 
framework specified by VIE theory provided 
an inadequate basis for the understanding of 
production behavior in Plant 2. Future the- 
oretical and empirical work has to establish 
what motivational processes are at work when 
boundary conditions for VIE theory exist. 
Furthermore, it is important to investigate 
what personal and organizational character- 
istics constitute boundary conditions. 

Finally, if boundary conditions represent 
organizational and/or personal situations that 
encourage trial and error behavior due to an 
unpredictable environment, then VIE theory 
and its possible boundary conditions may 
have useful implications for enhancing orga- 
nizational effectiveness. The implications 
listed below are offered as hypotheses for
future research:


1. Contingencies between performance and 
desirable outcomes should be designed and 
explicitly communicated to employees. 

2. Training procedures for supervisors
should emphasize (a) identification of em-
ployee goals and desired outcomes and (b)
specification and consistent administration of
contingencies between performance and both
extrinsic and intrinsic outcomes.

3. Selection systems should provide pro-
spective employees with accurate information
on organizational conditions and contin-
gencies; assess the extent to which outcomes
available in the organization are the ones
desired by the prospective employee; and pro-
vide realistic and accurate information to the
surrounding community concerning available
rewards and conditions in order to facilitate
self-selection.

4. Situational constraints on performance
(e.g., machine downtime, lack of materials or
information) should be minimized.

5. Perceived personal constraints on per-
formance (e.g., perceived lack of ability or
lack of confidence) should be minimized.

6. Orientation, communication, and em-
ployee development procedures should be
designed to facilitate accurate knowledge
about personal capabilities, organizational
conditions, and contingencies.

By studying and evaluating these implica-
tions and their impact on employee cognitions
and behavior, it may be possible to investigate
the interaction between organizational en-
vironment, individual differences, and em-
ployee cognitions. The VIE model presented
in this paper provides a useful starting point
for such an investigation.